From ianb at colorstudy.com Sat Mar 3 02:17:33 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 02 Mar 2007 19:17:33 -0600
Subject: [Web-SIG] PasteDeploy comments
In-Reply-To:
References:
Message-ID: <45E8CCAD.2050008@colorstudy.com>

Jim Fulton wrote:
> I don't remember if we decided that these would be sent to just you or
> to the Web SIG. Since I didn't see any messages go to the Web SIG, I'll
> assume we're just supposed to send these to you.

I suppose we could take this to Web-SIG.

For those who weren't at the PyCon mini-meeting we had, we talked about
creating a cross-framework application server. Basically the thing that
deals with PID files, chuser, parts of connection handling, etc. I don't
think we've written up anything yet, but hopefully some people who were
taking notes can expand. Or... something.

Anyway, we talked about using Paste Deploy entry points for configuration.

> - I think you were a bit uncomfortable about the use of the
>   global_config argument to the factory functions. I share this
>   discomfort a bit. It seems a little odd to expose the configuration
>   mechanism this much. It isn't a big deal for me.
>
>   What have you used global configuration data for?

It's often meant for configuration that applies to many components. For
instance, a "debug" value that applies widely (or could also be applied
locally). Or information about where to email errors, some logging
information, etc. E.g., you might give a base directory for logging in
global_conf, and an application could pick that up and probably put it
in a subdirectory there (where if you configured it locally, you'd
probably give the application the full path of the log file).

> - The semantics of paste.server_factory seem to be a little unclear. In
>   particular, I *assume* that the return value is expected to block when
>   run. Is this true? If so, then it makes it hard to have more than one
>   server.
>   I know that you aren't fond of the idea of having multiple
>   servers, but a lot of other folks seem to want it. :) In any case, the
>   semantics of the return value need to be documented.

paste.server_factory should be expanded, in part for what you are
proposing (starting multiple servers). Also, it seems like there should
be a better way to shut it down than killing the entire process. For
instance, for performance testing.

For multiple servers, I'd generally rather have servers support multiple
sockets, though this is a little hard in Paste Deploy (you might have
to use a set of prefixes for configuring each, if you have configuration
that is port-specific). But I don't think there's anything wrong with
starting multiple servers, if you really are starting truly different
servers.

This could all be done in the same entry point, with optional methods
(instead of just __call__ being specified), or a new entry point (which
might be a bit more explicit).

> - If multiple servers are supported, then there will need to be a way to
>   specify which applications are used with which servers.

As long as the connection data is there, you can dispatch later (if you
want to at all). For instance, most people want http and https to serve
the same application. In paste.urlmap configuration I allow things like
(in addition to path dispatch):

   domain foo = foo_app
   port 443 = https_app
   domain bar port 8888 = test_app

But you can also easily send everything to the same place, or a group of
things to the same place. I find this generally more convenient than
building dispatch any further down.

Arguably the config syntax could support urlmap more natively. E.g.,
allow sections like [app:/blog]. This could be turned into urlmap
construction. Assuming you don't care about the order in which
middleware is applied, you could have [filter:/blog] automatically wrap
that application. (With multiple middleware on the same location, I
suppose you'd have to supply some qualifier.)
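
The "dispatch later" idea above can be sketched in plain WSGI, assuming
only the standard library. This is not paste.urlmap's implementation;
make_dispatcher, labelled_app and call are hypothetical names made up
for illustration, and the sketch skips the SCRIPT_NAME/PATH_INFO
shifting a real path dispatcher would do.

```python
def make_dispatcher(routes, default):
    """Pick a WSGI app from connection data (domain, port, path prefix).

    routes is a list of (match, app) pairs; match is a dict with any of
    the keys "domain", "port" and "path".  The first match wins.
    """
    def dispatcher(environ, start_response):
        host = environ.get("HTTP_HOST", environ.get("SERVER_NAME", ""))
        domain = host.split(":")[0]
        port = environ.get("SERVER_PORT", "")
        path = environ.get("PATH_INFO", "")
        for match, app in routes:
            if "domain" in match and match["domain"] != domain:
                continue
            if "port" in match and str(match["port"]) != port:
                continue
            prefix = match.get("path", "")
            if prefix and not (path == prefix or path.startswith(prefix + "/")):
                continue
            return app(environ, start_response)
        return default(environ, start_response)
    return dispatcher


def labelled_app(label):
    """Hypothetical stand-in application that just returns its own name."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [label.encode("ascii")]
    return app


def call(app, **environ):
    """Invoke a WSGI app with a minimal environ and return the body text."""
    def start_response(status, headers, exc_info=None):
        pass
    return b"".join(app(environ, start_response)).decode("ascii")


# Mirrors the three example rules quoted above, plus a catch-all.
dispatcher = make_dispatcher(
    [
        ({"domain": "foo"}, labelled_app("foo_app")),
        ({"port": 443}, labelled_app("https_app")),
        ({"domain": "bar", "port": 8888}, labelled_app("test_app")),
    ],
    default=labelled_app("main_app"),
)
```

Because the dispatcher only reads the WSGI environ, the same object can
sit behind one server or several, which is the point being made about
not pushing dispatch further down.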
> Overall, PasteDeploy looks very usable. I'll probably find other issues
> when I actually try to use it. One of my next projects will be to look
> at how to use it in Zope. zope.paste is a bit too much of a wedge.

zope.paste, as I remembered it, didn't really seem to allow things like
instantiating multiple Zope applications. But I can't remember. And
that's not always feasible; Zope 2 is unlikely to really support many
truly separate instantiated applications, but it could still support the
basic configuration.

Also note that in practice usually an application presents the entry
point directly, and the framework provides functions to make
application-specific entry points easy to write.

> On a related note, I'll probably want to do process configuration in the
> same file that PasteDeploy uses. This would likely include things
> like:
>
> - interrupt-check-interval
>
> - Log files
>
> I guess there is nothing to prevent this. I suspect that I'll also get
> a lot of resistance to moving this out of zope.conf. :/

Yes, the container configuration. (Incidentally, what exactly do we call
this thing we're proposing to make?)

> Have you tried pointing logging.fileConfig at a config file containing
> PasteDeploy sections? I assume it would work.

I haven't tried it, but I think Ben Bangert has started work on that,
using global_conf['__file__'] that way. A more cohesive logging story
that included that would be nice.

--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From chad at zetaweb.com Sat Mar 3 04:29:27 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Fri, 02 Mar 2007 22:29:27 -0500
Subject: [Web-SIG] more comments on Paste Deploy
Message-ID: <45E8EB97.6090805@zetaweb.com>

All,

Thanks, Jim and Ian, for bringing this discussion online.

I have two hesitations with Paste Deploy:

  1. The configuration syntax is really complex. I'm much more
     comfortable with multiple simpler config files.

  2.
     I'm not clear on how Paste Deploy's abstractions map to the
     filesystem. What does my website root look like?

With Aspen, I went with a well-defined filesystem layout (a
Unix-style userland) and multiple configuration files (in etc/),
each with their own simple syntax.

So if you publish a blog app called SuperBlog, let's say, you
would mount it in etc/apps.conf, e.g.:

   /       myapp:root
   /blog   superblog:main

SuperBlog would configure itself with etc/superblog.conf, a file
with a simple syntax described in your SuperBlog documentation.
SuperBlog also has access to Aspen's global config through a
simple API.

I suggest that a system with multiple simple config files is much
more scalable than a single complex config file syntax. Imagine
if all of Unix were configured using a single syntax!

Also, I don't think we should underestimate the importance of the
file/executable distinction. A standard "file format" for a
website enables a wider tool ecosystem to evolve: interactive
shells, debuggers, test runners, skel systems, configuration UIs.
It also makes any given website easier to comprehend and maintain.

So in short, I give Paste Deploy a -1 as our main configuration
system. I'd like the first-line config to be much simpler, with
Paste Deploy available as an optional extra.

chad

From chad at zetaweb.com Sat Mar 3 14:09:23 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Sat, 03 Mar 2007 08:09:23 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <45E8EB97.6090805@zetaweb.com>
References: <45E8EB97.6090805@zetaweb.com>
Message-ID: <45E97383.9090905@zetaweb.com>

> A standard "file format" for a website enables a wider tool
> ecosystem to evolve: interactive shells, debuggers, test
> runners, skel systems, configuration UIs.

Not to mention existing tools like workingenv, distutils, ...
From jim at zope.com Sat Mar 3 16:04:51 2007
From: jim at zope.com (Jim Fulton)
Date: Sat, 3 Mar 2007 10:04:51 -0500
Subject: [Web-SIG] PasteDeploy comments
In-Reply-To: <45E8CCAD.2050008@colorstudy.com>
References: <45E8CCAD.2050008@colorstudy.com>
Message-ID:

On Mar 2, 2007, at 8:17 PM, Ian Bicking wrote:

> Jim Fulton wrote:
>> What have you used global configuration data for?
>
> It's often meant for configuration that applies to many
> components. For instance, a "debug" value that applies widely (or
> could also be applied locally). Or information about where to
> email errors, some logging information, etc. E.g., you might give
> a base directory for logging in global_conf, and an application
> could pick that up and probably put it in a subdirectory there
> (where if you configured it locally, you'd probably give the
> application the full path of the log file).

I know what it's meant for. I was asking what it was actually *used*
for. Is this truly useful?

>
>> - The semantics of paste.server_factory seem to be a little
>>   unclear. In particular, I *assume* that the return value is
>>   expected to block when run. Is this true? If so, then it makes
>>   it hard to have more than one server. I know that you aren't
>>   fond of the idea of having multiple servers, but a lot of other
>>   folks seem to want it. :) In any case, the semantics of the return
>>   value need to be documented.
>
> paste.server_factory should be expanded, in part for what you are
> proposing (starting multiple servers).

Cool

> Also, it seems like there should be a better way to shut it down
> than killing the entire process. For instance, for performance
> testing.

This doesn't seem important to me.

...

>> Overall, PasteDeploy looks very usable. I'll probably find other
>> issues when I actually try to use it. One of my next projects will
>> be to look at how to use it in Zope. zope.paste is a bit too much
>> of a wedge.
>
> zope.paste, as I remembered it, didn't really seem to allow things
> like instantiating multiple Zope applications. But I can't
> remember. And that's not always feasible; Zope 2 is unlikely to
> really support many truly separate instantiated applications, but
> it could still support the basic configuration.

zope.paste tries very hard to minimize its impact on zope
configuration. It has to make a number of compromises to do this.
It is impossible to run "truly separate" Python applications in the
same process, for some definition of "truly separate" and
"application". Separate WSGI applications will share common module
definitions and shared module globals. I can easily imagine separate
Zope (2 & 3) applications that exposed separate object spaces or sets
of procedural (as opposed to object-based) pages.

>> On a related note, I'll probably want to do process configuration
>> in the same file that PasteDeploy uses. This would likely
>> include things like:
>> - interrupt-check-interval
>> - Log files
>> I guess there is nothing to prevent this. I suspect that I'll
>> also get a lot of resistance to moving this out of zope.conf. :/
>
> Yes, the container configuration. (Incidentally, what exactly do
> we call this thing we're proposing to make?)

I'm not sure we're initially proposing to make *a* thing. For starters
I think we're exploring using the PasteDeploy-defined frameworks and
collaborating on server testing.

I would call this the main program, but maybe other terms would be
better.

>> Have you tried pointing logging.fileConfig at a config file
>> containing PasteDeploy sections? I assume it would work.
>
> I haven't tried it, but I think Ben Bangert has started work on
> that, using global_conf['__file__'] that way. A more cohesive
> logging story that included that would be nice.

I think this should be done by the main program (container/whatever)
not by an application.

Jim

--
Jim Fulton          mailto:jim at zope.com    Python Powered!
CTO                 (540) 361-1714          http://www.python.org
Zope Corporation    http://www.zope.com     http://www.zope.org

From jim at zope.com Sat Mar 3 16:21:28 2007
From: jim at zope.com (Jim Fulton)
Date: Sat, 3 Mar 2007 10:21:28 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <45E8EB97.6090805@zetaweb.com>
References: <45E8EB97.6090805@zetaweb.com>
Message-ID: <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>

I'll respond in a high-level way.

I believe we're evaluating Paste Deploy at 2 levels:

1. Can we agree on a standard set of entry points so that WSGI
   applications can be combined automatically? I think Paste Deploy
   provides at least a good start on this.

2. Do we want to reuse its configuration syntax.

You haven't commented on the entry points defined by Paste Deploy. Do
you have an opinion on adopting the entry-point API defined by Paste
Deploy?

On the subject of configuration format, I suppose this is a matter of
taste. I strongly prefer having fewer configuration files, preferably
one. One of the things I like about zc.buildout is that it lets me
gather my configuration in one file. The configuration format used by
Paste Deploy is a simple standard format used by many many systems
inside and outside the Python community. This makes it easy for people
to learn and understand.

Obviously, we can agree to disagree on this. I'd very much like, at a
minimum, to agree on the entry point API so we can more easily
collaborate on interoperable applications, middleware, and servers.

Jim

On Mar 2, 2007, at 10:29 PM, Chad Whitacre wrote:

> All,
>
> Thanks, Jim and Ian, for bringing this discussion online.
>
> I have two hesitations with Paste Deploy:
>
>   1. The configuration syntax is really complex. I'm much more
>      comfortable with multiple simpler config files.
>
>   2. I'm not clear on how Paste Deploy's abstractions map to the
>      filesystem. What does my website root look like?
>
>
> With Aspen, I went with a well-defined filesystem layout (a
> Unix-style userland) and multiple configuration files (in etc/),
> each with their own simple syntax.
>
> So if you publish a blog app called SuperBlog, let's say, you
> would mount it in etc/apps.conf, e.g.:
>
>    /       myapp:root
>    /blog   superblog:main
>
> SuperBlog would configure itself with etc/superblog.conf, a file
> with a simple syntax described in your SuperBlog documentation.
> SuperBlog also has access to Aspen's global config through a
> simple API.
>
> I suggest that a system with multiple simple config files is much
> more scalable than a single complex config file syntax. Imagine
> if all of Unix were configured using a single syntax!
>
>
> Also, I don't think we should underestimate the importance of the
> file/executable distinction. A standard "file format" for a
> website enables a wider tool ecosystem to evolve: interactive
> shells, debuggers, test runners, skel systems, configuration UIs.
> It also makes any given website easier to comprehend and maintain.
>
>
> So in short, I give Paste Deploy a -1 as our main configuration
> system. I'd like the first-line config to be much simpler, with
> Paste Deploy available as an optional extra.
>
>
>
>
> chad
> _______________________________________________
> Web-SIG mailing list
> Web-SIG at python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: http://mail.python.org/mailman/options/web-sig/jim%40zope.com

--
Jim Fulton          mailto:jim at zope.com    Python Powered!
CTO                 (540) 361-1714          http://www.python.org
Zope Corporation    http://www.zope.com     http://www.zope.org

From jim at zope.com Sat Mar 3 16:42:10 2007
From: jim at zope.com (Jim Fulton)
Date: Sat, 3 Mar 2007 10:42:10 -0500
Subject: [Web-SIG] My summary of a web-platform Open-Space discussion at PyCon 2007
Message-ID: <52A740B9-536E-4946-B576-6BA818DF0730@zope.com>

I'll summarize my recollections of a very useful discussion that
several of us had at PyCon 2007.
At PyCon, Chad Whitacre gathered a number of us for an Open Space
discussion on how we might collaborate on common infrastructure
"below WSGI". As I understood this, this included things like:

- WSGI application assembly

- Main programs

- Process management tools

- Daemon start, stop, status, etc.

- Signal handling

- Log rotation

- Etc.

I managed to add:

- Server benchmarks

Maybe there were other things in scope that I forgot. We should have
appointed a secretary. :)

I think we decided on some immediate actions:

- Give Ian feedback on Paste Deploy

- Ian will lead a server benchmark effort

In addition, I think there is interest in coming up with best practices
for daemon and Windows service management. I don't think there were
specific action items. A few tools were mentioned. (I'll send a
separate brief note on my ideas about this).

My impression is that there isn't a lot of appetite for standardizing
on a common pain application.

Jim

--
Jim Fulton          mailto:jim at zope.com    Python Powered!
CTO                 (540) 361-1714          http://www.python.org
Zope Corporation    http://www.zope.com     http://www.zope.org

From jim at zope.com Sat Mar 3 17:08:24 2007
From: jim at zope.com (Jim Fulton)
Date: Sat, 3 Mar 2007 11:08:24 -0500
Subject: [Web-SIG] daemon tools
Message-ID: <515038D2-29A5-498A-848E-8802C1963C91@zope.com>

For some time, Zope has used a daemon-management tool we wrote called
zdaemon:

   http://www.python.org/pypi/zdaemon

Until late last year, I found this tool a bit difficult to use because
it was essentially undocumented. I was forced to learn enough to
mostly document it and have gained a new appreciation of it. (I
haven't documented its interactive shell mode, which I don't use.
Maybe someone will document it or maybe I'll just rip it out.)

I considered making some enhancements to it and decided to ask if some
folks knew about alternative tools we might use instead.
See the discussion at:

   http://mail.zope.org/pipermail/zope3-dev/2006-December/021353.html

Ironically, this sort of tool isn't Python specific at all, and the
discussion highlighted some non-Python tools, notably daemontools and
runit, neither of which seemed as appealing as zdaemon for various
reasons. This discussion also noted a Python-based tool named
supervisor2:

   http://www.plope.com/software/supervisor2/

Which seems to be derived from zdaemon and has some interesting
features. I think that both zdaemon and supervisor2 do a better job of
process management than daemontools or runit.

At the recent open-space discussion, another Python-based tool was
mentioned whose name I don't remember.

I ended up deciding to use zdaemon for our projects because it met our
needs very well. I added a couple of enhancements:

- The ability to set environment variables. This is really important
  to us as it allows us to set LD_LIBRARY_PATH. This wants to be done
  in a supervisor process. A Python program can't set LD_LIBRARY_PATH
  for itself because it is too late for it to be used by the library
  loader.

- I finished the transcript log, making it rotatable. The zdaemon
  transcript log consumes the standard error and output of the program
  zdaemon manages, providing basic logging for applications that have
  lacking or lame logging support. (zdaemon has allowed us to make the
  spread daemon far more manageable.)

Anyway, I share this for your consideration. There are probably
better tools out there than zdaemon and supervisor2, but I'm not aware
of them. :) I'm curious what other people have found or use.

Jim

--
Jim Fulton          mailto:jim at zope.com    Python Powered!
CTO                 (540) 361-1714          http://www.python.org
Zope Corporation    http://www.zope.com     http://www.zope.org

From chad at zetaweb.com Sat Mar 3 17:09:37 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Sat, 03 Mar 2007 11:09:37 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
Message-ID: <45E99DC1.4010703@zetaweb.com>

Jim,

Thanks for the reply.

> 2. Do we want to reuse its configuration syntax.

-1

> The configuration format used by Paste Deploy is a simple
> standard format used by many many systems inside and outside
> the Python community.

I'm not objecting to the general ini-style format (do I read you
right?), but rather to the overloaded section names, the URI/name
syntax, the 'set' prefix, composite applications, etc. Paste
Deploy layers a whole mini-language on top of the ini format.

> Obviously, we can agree to disagree on this.

Sure, as long as Paste Deploy's config syntax is optional for
whatever-we're-building. :^)

> 1. Can we agree on a standard set of entry points so that WSGI
> applications can be combined automatically? I think Paste
> Deploy provides at least a good start on this.
>
> You haven't commented on the entry points defined by Paste
> Deploy. Do you have an opinion on adopting the entry-point API
> defined by Paste Deploy?

Ok, I need help: defining an entry point allows a plugin to
advertise that it can satisfy that entry point, but you still
need a configuration layer to actually wire it up, no? In which
case:

   1) What does "automatically" mean?

   2) Aren't we back to discussing config syntax?
chad

From chad at zetaweb.com Sat Mar 3 17:12:48 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Sat, 03 Mar 2007 11:12:48 -0500
Subject: [Web-SIG] daemon tools
In-Reply-To: <515038D2-29A5-498A-848E-8802C1963C91@zope.com>
References: <515038D2-29A5-498A-848E-8802C1963C91@zope.com>
Message-ID: <45E99E80.7050800@zetaweb.com>

> Anyway, I share this for your consideration. There are probably
> better tools out there than zdaemon and supervisor2, but I'm not
> aware of them. :) I'm curious what other people have found or use.

There's also monit:

   http://www.tildeslash.com/monit/

chad

From chad at zetaweb.com Sat Mar 3 17:18:41 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Sat, 03 Mar 2007 11:18:41 -0500
Subject: [Web-SIG] My summary of a web-platform Open-Space discussion at PyCon 2007
In-Reply-To: <52A740B9-536E-4946-B576-6BA818DF0730@zope.com>
References: <52A740B9-536E-4946-B576-6BA818DF0730@zope.com>
Message-ID: <45E99FE1.1090307@zetaweb.com>

Jim,

> I'll summarize my recollections of a very useful discussion
> that several of us had at PyCon 2007.

Looks accurate to me, thanks.

> - Ian will lead a server benchmark effort

Where by "server," we mean core HTTP server library, yes?

> My impression is that there isn't a lot of appetite for
> standardizing on a common pain application.

Sorry, "pain application?" :^)

I assume you mean a common app server executable, as opposed to
best practice docs, entry point standards, maybe even libraries,
etc. Yes?

chad

From fumanchu at amor.org Sat Mar 3 20:05:12 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Sat, 3 Mar 2007 11:05:12 -0800
Subject: [Web-SIG] more comments on Paste Deploy
References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
Message-ID: <435DF58A933BA74397B42CDEB8145A86224D4C@ex9.hostedexchange.local>

Jim Fulton wrote:
> I believe, we're evaluating Paste Deploy at 2 levels:
> 1.
>    Can we agree on a standard set of entry points so
>    that WSGI applications can be combined automatically?
>    I think Paste Deploy provides at least a good start on this.

Yes, I think we can. And the ones in paste deploy are a good start (and
end, for all I know). But if Ian's going to split Paste Deploy out into
its own project (as he hinted), we should find a new namespace for them
besides 'paste.*' soon.

> 2. Do we want to reuse its configuration syntax.
> On the subject of configuration format, I suppose this
> is a matter of taste. I strongly prefer having fewer
> configuration files, preferably one.

In my head, we share a 'site daemon' among us, and a common 'webctl'
front end to that daemon should use a single INI-style config file (but
like Chad, I'm not sold on Paste's existing format). However, we should
build the site daemon in such a way that each framework can drive it in
a framework-specific way, and if they wanted to layer their own config
style on top of that interface, fine. This would make it easier for the
various framework authors and users to explore tutorials, run tests, and
deploy single-framework sites.

In short, I'm pushing for:

    read conf -> apply conf -> del conf -> work with objects

as opposed to the much more tightly-coupled and hard-to-use:

    read conf -> work with a mix of conf and objects forever

Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/web-sig/attachments/20070303/f911d6c2/attachment.html

From lcrees at gmail.com Sat Mar 3 20:21:53 2007
From: lcrees at gmail.com (L.C. Rees)
Date: Sat, 3 Mar 2007 12:21:53 -0700
Subject: [Web-SIG] more comments on Paste Deploy
Message-ID: <3ce244090703031121h578bdaa2g8a7a1375b45c60e1@mail.gmail.com>

> Sure, as long as Paste Deploy's config syntax is optional for
> whatever-we're-building.
> :^)

Some of the pain and angst over choosing one solution to the WSGI
application composition problem could be treated by dividing the
composition process into (at least) three parts:

1. Configuration parsing

   Configuration information is read from multiple files or one big file
   all at once (something ConfigParser in the standard library, for
   example, already has support for) or selectively. The information,
   stored in whatever format (INI, Python, even XML, pick your poison),
   is parsed (with optional validation) into a uniform internal Python
   format.

   The internal format would be a sequence of tuples. Each tuple would
   contain three elements:

   a. An identifier consisting of a tuple that contains two elements, an
      (optional) qualifying prefix and a more specific identifier.

   b. Configuration parameters that have been parsed into a tuple of
      positional arguments.

   c. Configuration parameters that have been parsed into a dictionary
      of keyword arguments.

2. Dispatching

   A dispatcher would take the sequence of tuples from the parser and
   resolve the identifier to an adapter. The dispatcher would then strip
   out the identifier, and pass a tuple containing the tuple of
   positional arguments and the dictionary of keyword arguments to the
   adapter. Different identifier schemes could be accommodated by the
   same dispatcher as needed.

3. Adapting

   The adapter would be responsible for taking the configuration data in
   the tuple passed to it by the dispatcher and returning a configured
   WSGI application.
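
The three stages above can be sketched in a few lines, assuming an
INI source and Python 3's configparser (the module the message calls
ConfigParser). Everything else here is hypothetical and chosen only for
illustration: the "prefix:name" section convention, the "args" option,
and the names parse, dispatch, text_app_adapter and run.

```python
import configparser

CONFIG = """
[app:hello]
body = Hello from the adapter
"""


def parse(text):
    """Stage 1: turn INI text into ((prefix, name), args, kwargs) tuples."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    entries = []
    for section in parser.sections():
        # "app:hello" -> ("app", "hello"); a bare name gets an empty prefix.
        prefix, _, name = section.rpartition(":")
        kwargs = dict(parser[section])
        # Assumed convention: an optional "args" option holds positional values.
        args = tuple(kwargs.pop("args", "").split())
        entries.append(((prefix, name), args, kwargs))
    return entries


def dispatch(entries, adapters):
    """Stage 2: resolve each identifier to an adapter and hand over the rest."""
    apps = {}
    for (prefix, name), args, kwargs in entries:
        adapter = adapters[prefix]
        # The identifier is stripped; the adapter only sees (args, kwargs).
        apps[name] = adapter((args, kwargs))
    return apps


def text_app_adapter(payload):
    """Stage 3: build a configured WSGI application from (args, kwargs)."""
    args, kwargs = payload
    body = kwargs.get("body", "").encode("utf-8")

    def app(environ, start_response):
        headers = [("Content-Type", "text/plain"),
                   ("Content-Length", str(len(body)))]
        start_response("200 OK", headers)
        return [body]
    return app


def run(app):
    """Call a WSGI app with a minimal environ and return its body as text."""
    def start_response(status, headers, exc_info=None):
        pass
    return b"".join(app({"REQUEST_METHOD": "GET"}, start_response)).decode("utf-8")


entries = parse(CONFIG)
apps = dispatch(entries, {"app": text_app_adapter})
```

Swapping the parser for one that reads a different format leaves
dispatch and the adapters untouched, as long as it emits the same
tuples; that separation is the whole appeal of the proposal.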
An approach that decomposes the WSGI application composition process
into distinct stages would accommodate different approaches to each
stage of the composition process while allowing interoperability similar
to how WSGI allows heterogeneous Python web applications to live
together in (greater) peace and harmony.

-lcr

From ianb at colorstudy.com Sat Mar 3 21:39:37 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 03 Mar 2007 14:39:37 -0600
Subject: [Web-SIG] daemon tools
In-Reply-To: <45E99E80.7050800@zetaweb.com>
References: <515038D2-29A5-498A-848E-8802C1963C91@zope.com> <45E99E80.7050800@zetaweb.com>
Message-ID: <45E9DD09.8030605@colorstudy.com>

Chad Whitacre wrote:
>> Anyway, I share this for your consideration. There are probably
>> better tools out there than zdaemon and supervisor2, but I'm not
>> aware of them. :) I'm curious what other people have found or use.
>
> There's also monit:
>
> http://www.tildeslash.com/monit/

I think monit overlaps some with supervisor2's featureset, but not as
much with zdaemon. Having monit poll your process to check it's alive
isn't as solid a solution as having a real parent process to do that.
Monit would still be useful with zdaemon, because it can poll things
like HTTP responses, memory usage, etc.

--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From ianb at colorstudy.com Sat Mar 3 21:40:59 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 03 Mar 2007 14:40:59 -0600
Subject: [Web-SIG] PasteDeploy comments
In-Reply-To:
References: <45E8CCAD.2050008@colorstudy.com>
Message-ID: <45E9DD5B.2070900@colorstudy.com>

Jim Fulton wrote:
>
> On Mar 2, 2007, at 8:17 PM, Ian Bicking wrote:
>
>> Jim Fulton wrote:
>>> What have you used global configuration data for?
>>
>> It's often meant for configuration that applies to many components.
>> For instance, a "debug" value that applies widely (or could also be
>> applied locally).
>> Or information about where to email errors, some
>> logging information, etc. E.g., you might give a base directory for
>> logging in global_conf, and an application could pick that up and
>> probably put it in a subdirectory there (where if you configured it
>> locally, you'd probably give the application the full path of the log
>> file).
>
> I know what it's meant for. I was asking what it was actually *used*
> for. Is this truly useful?

Well, for some things like the debug setting, definitely. That is,
*some* applications consume that value, but not all, and in the form of
global_conf the value just sort of hangs out without being applied to
anything in particular.

In deployments where I'm using a set of applications designed to work
together I've found it useful to pass values to all of the applications
at once.

Also when you pass values in through the command-line it gets put into
global_conf, because it's not clear what section it would otherwise
apply to (since the application you are intending to affect may be
wrapped by middleware).

>> Also, it seems like there should be a better way to shut it down
>> than killing the entire process. For instance, for performance testing.
>
> This doesn't seem important to me.

Really what I'd like it for is testing, in those times when I really
want to start up a real HTTP server to test against, then cleanly shut
it down.

>>> Overall, PasteDeploy looks very usable. I'll probably find other
>>> issues when I actually try to use it. One of my next projects will be
>>> to look at how to use it in Zope. zope.paste is a bit too much of a
>>> wedge.
>>
>> zope.paste, as I remembered it, didn't really seem to allow things
>> like instantiating multiple Zope applications. But I can't remember.
>> And that's not always feasible; Zope 2 is unlikely to really support
>> many truly separate instantiated applications, but it could still
>> support the basic configuration.
>
> zope.paste tries very hard to minimize its impact on zope
> configuration. It has to make a number of compromises to do this. It
> is impossible to run "truly separate" Python applications in the same
> process, for some definition of "truly separate" and "application".
> Separate WSGI applications will share common module definitions and
> shared module globals. I can easily imagine separate Zope (2 & 3)
> applications that exposed separate object spaces or sets of procedural
> (as opposed to object-based) pages.

"Separate" instances of applications is a fairly vague notion, that only
means something when applied specifically. I would hope that you could
start two Zope apps pointing at different ZODB instances, just like you
should be able to start two apps pointing to different objects in the
same ZODB.

>>> On a related note, I'll probably want to do process configuration in
>>> the same file that PasteDeploy uses. This would likely include
>>> things like:
>>> - interrupt-check-interval
>>> - Log files
>>> I guess there is nothing to prevent this. I suspect that I'll also
>>> get a lot of resistance to moving this out of zope.conf. :/
>>
>> Yes, the container configuration. (Incidentally, what exactly do we
>> call this thing we're proposing to make?)
>
> I'm not sure we're initially proposing to make *a* thing. For starters I
> think we're exploring using the PasteDeploy-defined frameworks and
> collaborating on server testing.
>
> I would call this the main program, but maybe other terms would be better.
>
>>> Have you tried pointing logging.fileConfig at a config file containing
>>> PasteDeploy sections? I assume it would work.
>>
>> I haven't tried it, but I think Ben Bangert has started work on that,
>> using global_conf['__file__'] that way. A more cohesive logging story
>> that included that would be nice.
>
> I think this should be done by the main program (container/whatever) not
> by an application.
In the case of Paste and Pylons, we wanted to add a bunch of logging to
the library. The library at that point doesn't belong to any
application. Having a bunch of logging without a clear story about how
to use that logging seemed bad (in this case it's mostly logging
intended for programmers, not final deployment, but some portions could
be useful in final deployment). It could (and probably would) be
applied as an outer middleware applied by individual applications, but
ideally there would be shared conventions across frameworks. Ideally it
would also make application-specific logging easier.

I think logging configuration is a general use case we should consider,
but I don't think it's part of the container really. It might relate to
something in Paste Deploy configuration.

--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From ianb at colorstudy.com Sat Mar 3 21:44:57 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 03 Mar 2007 14:44:57 -0600
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
Message-ID: <45E9DE49.1010801@colorstudy.com>

Jim Fulton wrote:
> I'll respond in a high-level way.
>
> I believe, we're evaluating Paste Deploy at 2 levels:
>
> 1. Can we agree on a standard set of entry points so that WSGI
>    applications can be combined automatically? I think Paste Deploy
>    provides at least a good start on this.
>
> 2. Do we want to reuse its configuration syntax.

Yes, I hope people will look at these separately. The entry points
provide a consistent way to get at middleware and applications. I've
been careful to not expose the actual configuration file to
applications, and I like that. It makes it possible to discuss these
separately.
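
For readers who have not seen the entry points under discussion, here is
a minimal sketch of the two factory shapes, paste.app_factory and
paste.filter_factory, using only the standard library. The factories
receive global_conf plus per-section keyword settings and never see the
config file itself. The settings used here (debug, greeting, header,
value) and the helpers as_bool and run are assumptions made up for the
example, not part of Paste Deploy.

```python
def as_bool(value):
    """Interpret common truthy strings; config values arrive as strings."""
    return str(value).strip().lower() in ("1", "true", "yes", "on")


def app_factory(global_conf, **local_conf):
    """Shape of a paste.app_factory entry point: returns a WSGI app."""
    # A local "debug" wins; otherwise fall back to the shared global value.
    debug = as_bool(local_conf.get("debug", global_conf.get("debug", "false")))
    greeting = local_conf.get("greeting", "hello")

    def app(environ, start_response):
        body = ("%s (debug=%s)" % (greeting, debug)).encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]
    return app


def filter_factory(global_conf, **local_conf):
    """Shape of a paste.filter_factory entry point: returns app -> app."""
    header = local_conf.get("header", "X-Example")
    value = local_conf.get("value", "1")

    def apply_filter(app):
        def wrapped(environ, start_response):
            def patched_start(status, headers, exc_info=None):
                # Append one response header, then defer to the real callable.
                return start_response(status, list(headers) + [(header, value)],
                                      exc_info)
            return app(environ, patched_start)
        return wrapped
    return apply_filter


def run(app):
    """Call a WSGI app and return (status, headers dict, body text)."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)
    body = b"".join(app({"REQUEST_METHOD": "GET"}, start_response))
    return captured["status"], captured["headers"], body.decode("utf-8")


global_conf = {"debug": "true"}
plain = app_factory(global_conf, greeting="hi")
wrapped = filter_factory(global_conf, header="X-Demo", value="yes")(plain)
overridden = app_factory(global_conf, debug="false")
```

Nothing in the factories depends on how the settings were written down,
which is why the entry points and the configuration syntax can be
evaluated separately.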
-- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From ianb at colorstudy.com Sat Mar 3 21:54:41 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Sat, 03 Mar 2007 14:54:41 -0600 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <45E8EB97.6090805@zetaweb.com> References: <45E8EB97.6090805@zetaweb.com> Message-ID: <45E9E091.3070603@colorstudy.com> Chad Whitacre wrote: > All, > > Thanks, Jim and Ian, for bringing this discussion online. > > I have two hesitations with Paste Deploy: > > 1. The configuration syntax is really complex. I'm much more > comfortable with multiple simpler config files. Is it really that complex? There's a few too many ways to do middleware around applications, I'm afraid. get/set is really a rather obscure feature that I seldom use. The distinction between "composite" and "app" isn't necessary, I think. The ability to inherit from sections is really useful IMHO (though not well described in documentation); that's where you do something like "use = other_section", and then add settings that override that other section's settings. > 2. I'm not clear on how Paste Deploy's abstractions map to the > filesystem. What does my website root look like? > > > With Aspen, I went with a well-defined filesystem layout (a > Unix-style userland) and multiple configuration files (in etc/), > each with their own simple syntax. > > So if you publish a blog app called SuperBlog, let's say, you > would mount it in etc/apps.conf, e.g.: > > / myapp:root > /blog superblog:main > > SuperBlog would configure itself with etc/superblog.conf, a file > with a simple syntax described in your SuperBlog documentation. > SuperBlog also has access to Aspen's global config through a > simple API. 
The way I have generally configured websites like this is like:

    [composite:main]
    use = egg:Paste#urlmap
    / = config:root.ini
    /blog = config:superblog.ini

Then I put root.ini and superblog.ini alongside this configuration file, and each has an [app:main] section. (You can also point to another section in a file, like config:root.ini#other_section)

> I suggest that a system with multiple simple config files is much
> more scalable than a single complex config file syntax. Imagine
> if all of Unix were configured using a single syntax!

I think it depends some on the particular case. Paste Deploy lets you do both. For instance, in one case we made a really simple application that just returned a random bit of HTML selected from a specific file full of HTML snippets (used with SSIs). The basic config looked like:

    [app:random]
    use = egg:Randomizer
    file = /path/to/file.html

Except we had about 5 of these, and we put them all in one file and then mounted them like:

    [composite:main]
    use = egg:Paste#urlmap
    /random1 = config:random.ini#random1
    /random2 = config:random.ini#random2
    ...

There are other cases where having both options is nice. Because Paste Deploy doesn't fold config files together, you can also reuse them from different contexts. (A more common way to use multiple config files -- what ConfigParser.read supports -- is to just overlap all the sections, usually totally clobbering each other. I like this more explicit way of bringing in configuration, which treats configuration like a composable set of configurations instead of a system where all the configuration files are pretty tightly bound to each other.)

> Also, I don't think we should underestimate the importance of the
> file/executable distinction. A standard "file format" for a
> website enables a wider tool ecosystem to evolve: interactive
> shells, debuggers, test runners, skel systems, configuration UIs.
> It also makes any given website easier to comprehend and maintain.
I'm not sure about the distinction you are making here. -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From ianb at colorstudy.com Sat Mar 3 22:06:07 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Sat, 03 Mar 2007 15:06:07 -0600 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <435DF58A933BA74397B42CDEB8145A86224D4C@ex9.hostedexchange.local> References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <435DF58A933BA74397B42CDEB8145A86224D4C@ex9.hostedexchange.local> Message-ID: <45E9E33F.7050604@colorstudy.com> Robert Brewer wrote: > Jim Fulton wrote: > > I believe, we're evaluating Paste Deploy at 2 levels: > > 1. Can we agree on a standard set of entry points so > > that WSGI applications can be combined automatically? > > I think Paste Deploy provides at least good start on this. > > Yes, I think we can. And the ones in paste deploy are a good start (and > end, for all I know). But if Ian's going to split Paste Deploy out into > its own project (as he hinted), we should find a new namespace for them > besides 'paste.*' soon. Well, only if we use the entry points ;). Paste Deploy already supports a couple overlapping entry points. It could support more, or a new system could support those plus some more (I assume even if we implement a new config file format here, I'll add support to Paste Deploy as well for people who don't switch over immediately). I don't think we should add any new names or prefixes until we've solidly settled on what those entry points define. If we want to rename the Paste Deploy entry point groups at that point, that's fine. > > 2. Do we want to reuse it's configuration syntax. > > On the subject of configuration format, I suppose this > > is a matter of taste. I strongly prefer having fewer > > configuration files, preferably one. 
> > In my head, we share a 'site daemon' among us, and a common 'webctl' > front end to that daemon should use a single INI-style config file (but > like Chad, I'm not sold on Paste's existing format). However, we should > build the site daemon in such a way that each framework can drive it in > framework-specific way, and if they wanted to layer their own config > style on top of that interface, fine. This would make it easier for the > various framework authors and users to explore tutorials, run tests, and > deploy single-framework sites. > > In short, I'm pushing for: > > read conf -> apply conf -> del conf -> work with objects > > as opposed to the much more tightly-coupled and hard-to-use: > > read conf -> work with a mix of conf and objects forever I definitely agree that we shouldn't pass big config objects to applications (or servers or middleware or whatever). I don't really like that global_conf['__file__'] gives you the filename; it's a little vague what it really means when you are nesting several files, and it can encourage hacky things. OTOH, if you want to fold your logging conf in with your app conf, it provides a reasonably easy way to do that I suppose. Anyway, besides one or two ways you can poke through, Paste Deploy mostly does this. Incidentally, one thing Paste Deploy doesn't really allow well is when you have really complicated configuration. For instance, an application like Trac has a big config file with lots of sections. One could argue that it's *too* big, but it is what it is. -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From ianb at colorstudy.com Sat Mar 3 22:37:52 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Sat, 03 Mar 2007 15:37:52 -0600 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <3ce244090703031121h578bdaa2g8a7a1375b45c60e1@mail.gmail.com> References: <3ce244090703031121h578bdaa2g8a7a1375b45c60e1@mail.gmail.com> Message-ID: <45E9EAB0.30201@colorstudy.com> L.C. 
Rees wrote: >> Sure, as long as Paste Deploy's config syntax is optional for >> whatever-we're-building. :^) > > Some of the pain and angst over choosing one solution to the WSGI > application composition problem could be treated by dividing the > composition process into (at least) three parts: > > 1. Configuration parsing > > Configuration information is read from multiple files or one big file > all at once (something ConfigParser in the standard library, for > example, already has support for) or selectively. The information, > stored in whatever format (INI, Python, even XML, pick your poison), > is parsed (with optional validation) into a uniform internal Python > format. I don't think we should have any validation in the config format (except for basic syntax, of course). Doing validation is just too hard, and leads to a rather complex config framework. I think some of the problems with ZConfig come back to this. I personally am quite happy with Paste Deploy using straight strings, not Python expressions or anything else that presumes to understand values. > The internal format would be a sequence of tuples. Each tuple > would contain three elements: > > a. An identifier consisting of a tuple that contains two elements, an > (optional) qualifying prefix and a more specific identifier. > b. Configuration parameters that have been parsed into a tuple of > positional arguments. > c. Configuration parameters that have been parsed into a dictionary of > keyword arguments. I'm confused here. Can you give an example of what this data would look like for something simple? (E.g., a blog app) How is this different from or better than a flat dictionary of strings (which is basically what Paste Deploy provides)? > 2. Dispatching > > A dispatcher would take the sequence of tuples from the parser and > resolve the identifier to an adapter.
The dispatcher would then strip > out the identifier, and pass a tuple containing the tuple of > positional arguments and the dictionary of keyword arguments to the > adapter. > > Different identifier schemes could be accommodated by the same > dispatcher as needed. I'm not sure what you are describing here. Is this like Paste Deploy, where we strip out the "use" key to find the entry point? > 3. Adapting > > The adapter would be responsible for taking the configuration data in > the tuple passed to it by the dispatcher and returning a configured > WSGI application. > > An approach that decomposes the WSGI application composition process > into distinct stages would accommodate different approaches to each > stage of the composition process while allowing interoperability > similar to how WSGI allows heterogeneous Python web applications to > live together in (greater) peace and harmony-lcr In some ways we can, in some ways we can't. For instance, a config file format that produces integers, lists, etc., is a bit hard to reconcile with a separate format that only produces strings. (If consumers always special-case strings this isn't so bad, but if you get used to getting non-strings you are less likely to do that.) Also, is order relevant? It isn't in dictionaries, but could be in a file format, but probably wouldn't be in a database. We have to come up with some lowest common denominator. And having done that, we can support *some* set of config formats or data sources, but a bunch of formats will seem superfluous, as any added value they might provide will be useless since it can't be relied upon. In this sense, while the entry points can be mostly discussed regardless of the config format, it's not entirely true -- you have to keep at least some set of config formats in your head at the same time as you are discussing the entry points.
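To make the three quoted stages (parse, dispatch, adapt) easier to compare with what Paste Deploy does, here is a rough sketch. Nothing in it is an existing API: parse_config, hello_adapter, ADAPTERS and compose are hypothetical names, the parsed form is assumed to be a flat dictionary of strings per section, and the "use" key is assumed to play the role of the identifier.

```python
from configparser import ConfigParser

def parse_config(text):
    """Stage 1: parse INI text into {section: {key: string}}; values stay strings."""
    parser = ConfigParser()
    parser.read_string(text)
    return {name: dict(parser.items(name)) for name in parser.sections()}

def hello_adapter(**settings):
    """Stage 3: turn string settings into a configured WSGI application."""
    body = settings.get("message", "hello").encode("utf-8")

    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [body]

    return app

# Hypothetical identifier -> adapter map consulted by the dispatcher.
ADAPTERS = {"hello": hello_adapter}

def compose(sections):
    """Stage 2: strip out the identifier ("use") and pass the rest to the adapter."""
    apps = {}
    for name, settings in sections.items():
        settings = dict(settings)            # do not mutate the parser's output
        adapter = ADAPTERS[settings.pop("use")]
        apps[name] = adapter(**settings)
    return apps
```

Since every value stays a string, a consumer written against this sketch would behave the same whether the settings came from an INI file or from some other source that can only supply strings, which is the lowest-common-denominator concern raised above.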
-- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From ianb at colorstudy.com Sat Mar 3 22:48:56 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Sat, 03 Mar 2007 15:48:56 -0600 Subject: [Web-SIG] PasteDeploy comments In-Reply-To: References: <45E8CCAD.2050008@colorstudy.com> Message-ID: <45E9ED48.3070905@colorstudy.com> Jim Fulton wrote: > > On Mar 2, 2007, at 8:17 PM, Ian Bicking wrote: > >> Jim Fulton wrote: >>> What have you used global configuration data for? >> >> It's often meant for configuration that applies to many components. >> For instance, a "debug" value that applies widely (or could also be >> applied locally). Or information about where to email errors, some >> logging information, etc. E.g., you might give a base directory for >> logging in global_conf, and an application could pick that up and >> probably put it in a subdirectory there (where if you configured it >> locally, you'd probably give the application the full path of the log >> file). > > I know what it's meant for. I was asking what it was actually *used* > for. Is this truly useful? An example that would probably apply to Zope: you have several Zope apps, but they aren't at the "top" of the website. That is, there's some dispatchers and middleware before you get to them. If you want them all to use some common configuration -- stuff like the location of the ZODB -- you might set those values globally, and if the applications specifically picked those up (which I would expect) then that would be convenient. Some configuration values don't make any sense to set globally, and applications can require local settings (or require that there are no extra local settings), so I think the distinction is nice. I initially planned to just fold all the configuration into one set of keywords, but Phillip talked me out of it. 
It would mean that every application would have to take a bunch of keyword arguments they would ignore (since there might be global settings that didn't apply to them), and they could unintentionally pick up global arguments that only coincidentally matched local settings. Not having *any* global settings would be doable. You'd have to use a lot more of the "get" option that Paste Deploy uses, or maybe if it had an option to draw in the settings from another section (e.g., you'd set up one zodb section and draw in from it in all your apps). You'd have to know where those settings applied, you wouldn't coincidentally get those values, nor would you be as likely to give good site-wide defaults where general defaults were acceptable. -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From fumanchu at amor.org Sun Mar 4 00:19:15 2007 From: fumanchu at amor.org (Robert Brewer) Date: Sat, 3 Mar 2007 15:19:15 -0800 Subject: [Web-SIG] daemon tools References: <515038D2-29A5-498A-848E-8802C1963C91@zope.com> Message-ID: <435DF58A933BA74397B42CDEB8145A86224D4D@ex9.hostedexchange.local> Jim Fulton wrote: > For some time, Zope has used a daemon-management tool > we wrote called zdaemon: > > http://www.python.org/pypi/zdaemon > > Ironically, this sort of tool isn't Python specific at all, > and the discussion highlighted some non-Python tools, notably > daemontools and runit, neither of which seemed as appealing > as zdaemon for various reasons. The user interface isn't Python-specific, but the interaction with WSGI servers, middleware, applications, and frameworks should be. Components at all levels of the WSGI stack need to interact with "site-wide" events and settings. What I'm envisioning (and writing for CP at the moment) is a framework-neutral, one-per-site Engine object that is basically a publish/subscribe messenger; when you import a Python web framework, it registers listeners for process start, stop, and graceful restart. 
These would be things that need to happen regardless of the OS process invoker: whether a common 'webctl' script (that we author), or a framework-specific function (like cherrypy.quickstart), or Apache (via mod_python). The pub/sub model also supports plugins with their own channel(s). For example, frameworks would blindly call engine.publish('autoreload.add', filename) as desired. If the invoker (webctl, quickstart, or Apache) plugs in an autoreloader, great; it subscribes to that channel, receives each message, and adds each filename to its list of files to monitor. If no autoreloader has been plugged in, the 'add' message is correctly ignored. And when the autoreloader detects a change, it would also publish 'reload' or 'reexec' messages, which would then be subscribed to by a Reexec plugin. Most of the plugins would be provided by the invoker, but frameworks would be free to use the Engine to register their own events and event listeners. This interface between a site-wide container and the WSGI components is far more important to me than the actual details of invocation (like forking, signal-handling, logging, etc). The latter can be written as Engine plugins, and can compete in a market created by a good "Web Site Engine Interface" spec. Robert Brewer System Architect Amor Ministries fumanchu at amor.org From reinoutvanrees at gmail.com Sun Mar 4 00:37:22 2007 From: reinoutvanrees at gmail.com (Reinout van Rees) Date: Sat, 03 Mar 2007 23:37:22 -0000 Subject: [Web-SIG] python.org mailing list memberships reminder In-Reply-To: References: Message-ID: <1172965042.429886.67700@v33g2000cwv.googlegroups.com> On Mar 1, 5:04 am, mailman-ow... at python.org wrote: > This is a reminder, sent out once a month, about your python.org > mailing list memberships.
It includes your subscription info and how > to use it to change it or unsubscribe from a list. Just before someone starts messing around with google groups' mailinglist subscription: I just switched off the monthly password reminder by using the password transmitted in that way :-) Reinout From chad at zetaweb.com Sun Mar 4 01:27:30 2007 From: chad at zetaweb.com (Chad Whitacre) Date: Sat, 3 Mar 2007 19:27:30 -0500 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <45E9E091.3070603@colorstudy.com> References: <45E8EB97.6090805@zetaweb.com> <45E9E091.3070603@colorstudy.com> Message-ID: Ian, Thanks for weighing in. > > 2. I'm not clear on how Paste Deploy's abstractions map to the > > filesystem. What does my website root look like? > > The way I have generally configured websites like this is like: > > [composite:main] > use = egg:Paste#urlmap > / = config:root.ini > /blog = config:superblog.ini Right, that's the configuration, but where is "egg:Paste#urlmap" on the filesystem? Are the three ini files alone in some directory? Where is paste? Where is SuperBlog? Where is the rest of the site? I find it easier to start with the filesystem and then move up into object/config abstractions. > > Also, I don't think we should underestimate the importance of the > > file/executable distinction. A standard "file format" for a > > website enables a wider tool ecosystem to evolve: interactive > > shells, debuggers, test runners, skel systems, configuration UIs. > > It also makes any given website easier to comprehend and maintain. > > I'm not sure about the distinction you are making here. ODT vs. DOC ODS vs. XLS ODP vs. 
PPT From ianb at colorstudy.com Sun Mar 4 01:54:19 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Sat, 03 Mar 2007 18:54:19 -0600 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: References: <45E8EB97.6090805@zetaweb.com> <45E9E091.3070603@colorstudy.com> Message-ID: <45EA18BB.6030703@colorstudy.com> Chad Whitacre wrote: >> > 2. I'm not clear on how Paste Deploy's abstractions map to the >> > filesystem. What does my website root look like? >> >> The way I have generally configured websites like this is like: >> >> [composite:main] >> use = egg:Paste#urlmap >> / = config:root.ini >> /blog = config:superblog.ini > > Right, that's the configuration, but where is "egg:Paste#urlmap" on > the filesystem? Are the three ini files alone in some directory? Where > is paste? Where is SuperBlog? Where is the rest of the site? I find it > easier to start with the filesystem and then move up into > object/config abstractions. You just have to understand what egg:Paste#urlmap is, probably from some documentation. Admittedly that's boilerplate in the eyes of most people who use it. It's there explicitly because Paste Deploy doesn't build *any* WSGI anything into it, it only composes pieces, one of the most common being urlmap. You can see docs for it with "paster points paste.composite_factory urlmap", though I now notice I haven't written any docs for it (bad of me), and that is hardly a simple command line. I would certainly want to build a command-line help/browser (and probably web one too) as part of a rewrite of the system. The three ini files do go in the same directory, though of course you could do config:superblog/app.ini or something like that if you wanted to set it up differently. It's a relative filename, relative to the file where it is given. The applications themselves are eggs. You install them however you want to install them (of course I'd strongly recommend workingenv, virtual-python, or zc.buildout, but that's a separate concern). 
Some people have mentioned some frustration about having to build full libraries with a namespace, setup.py, eggs, etc. just to use applications. But I think even pretty modest shops writing very one-off apps gain a real benefit from these patterns, once you get over the initial hump (and we can build tools to make the initial hump not so bad, that's the point of paster create). Anyway, here's one reply I made to that request: http://pythonpaste.org/archives/message/20070215.192041.1534ce27.en.html There's a lot of practices around library management that *has* to be done, because people use libraries. Most of this applies pretty well to applications as well -- and since everyone *needs* to learn how to manage their libraries, using the same mechanisms for managing applications has some advantage. Incidentally, one change to the config format that would make it possible to remove the explicit idea of "composite" apps, is to make some key syntax that will instantiate the named object. E.g.,: app / = config:root.ini Then the keywords passed would just be {"/": }, instead of the current {"/": "config:root.ini"} (where the "config:root.ini" is passed to the loader object that the composite factory gets). >> > Also, I don't think we should underestimate the importance of the >> > file/executable distinction. A standard "file format" for a >> > website enables a wider tool ecosystem to evolve: interactive >> > shells, debuggers, test runners, skel systems, configuration UIs. >> > It also makes any given website easier to comprehend and maintain. >> >> I'm not sure about the distinction you are making here. > > ODT vs. DOC > ODS vs. XLS > ODP vs. PPT Still unclear. 
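Returning to the composite example earlier in this message, here is a rough sketch of how a composite factory currently sees its settings: as strings such as "config:root.ini" that it resolves through the loader it is handed. It assumes the paste.composite_factory calling convention (loader, global_conf, keyword settings) and a loader with a get_app method. FakeLoader is a stand-in so the sketch runs on its own, url_composite is a hypothetical name, and the dispatch is much cruder than what paste.urlmap does (it does not adjust SCRIPT_NAME or PATH_INFO).

```python
class FakeLoader:
    """Stand-in for the loader a composite factory receives (assumption for this sketch)."""

    def __init__(self, apps):
        self.apps = apps

    def get_app(self, name, global_conf=None):
        return self.apps[name]


def url_composite(loader, global_conf, **local_conf):
    """Map path prefixes to apps; values arrive as strings like "config:root.ini"."""
    # Resolve each string through the loader; longest prefix first so "/blog" wins over "/".
    mounts = sorted(
        ((path, loader.get_app(spec, global_conf=global_conf))
         for path, spec in local_conf.items()),
        key=lambda item: len(item[0]),
        reverse=True,
    )

    def app(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, mounted in mounts:
            if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
                return mounted(environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found"]

    return app
```

With the proposed "app / = config:root.ini" syntax, the loader.get_app step would happen before the factory is called, so the factory would receive already-built applications instead of strings.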
-- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From lcrees at gmail.com Sun Mar 4 02:06:51 2007 From: lcrees at gmail.com (Lynn Rees) Date: Sat, 03 Mar 2007 18:06:51 -0700 Subject: [Web-SIG] more comments on Paste Deploy Message-ID: <45EA1BAB.8090801@gmail.com> > I don't think we should have any validation in the config format > (except for basic syntax, of course). Doing validation is just too > hard, and leads to a rather complex config framework. I think some > of the problems with ZConfig come back to this. I didn't propose that validation be in the config format. I proposed that the configuration parser, whatever config format it's parsing, pass the configuration data it extracts on to the next stage of the WSGI composition process in a standard format. The parser may or may not validate configuration data before it passes it on; decomposing WSGI composition into distinct and modular stages means that the rest of the composition process doesn't have to care. > I personally am quite happy with Paste Deploy using straight strings, > not Python expressions or anything else that presumes to understand > values. I don't disagree but whether to use strings or not is an implementation issue (Paste Deploy) and not a process issue (WSGI application composition). My proposal addressed the process, not the particular implementation. > I'm confused here. Can you give an example of what this data would > look like for something simple? (E.g., a blog app) How does this > different or better than a flat dictionary of strings (which is > basically what Paste Deploy provides)? The message passing format is based on the following premise: ultimately, any configuration of a WSGI component involves 1) locating a Python routine and 2) passing some combination of arguments and/or keywords to it. 
The format: (identifier, (args), {kwargs}) contains sufficient information to 1) identify a Python routine and 2) pass configuration data to it in a format it's hardwired to handle. Whether a collection of configuration directives for a group of WSGI components is passed on to the next stage of the composition process as a tuple or a dictionary e.g. {identifier1:((args), {kwargs}, identifier2:((args), {kwargs})} is a matter of complete indifference to me. > I'm not sure what you are describing here. Is this like in Paste > Deploy, we strip out the "use" key to find the entry point? The use of "use" and the concept of a distinct dispatching stage are complementary. The dispatcher accesses a map of identifiers to adapters, fetches the adapter matching an identifier, and passes configuration data to it. The identifier could be the value specified by the "use" key. That's an implementation decision. > In some ways we can, in some ways we can't. For instance, a config > file format that produces integers, lists, etc., is a bit hard to > reconcile with a separate format that only produces strings. (If > consumers always special-case strings this isn't so bad, but if you > get used to getting non-strings you are less likely to do that.) What's passed in the args tuple and kwargs dictionary is the internal business of either the configuration parser that kicks the process off and the adapter that receives it at the end. From the point of view of passing the data between stages in the composition process, the type of the container is the only type that matters. Most Python containers are type agnostic and I think that's a good principle to remain faithful to in an interop format. > Also, is order relevant? It isn't in dictionaries, but could be in a > file format, but probably wouldn't be in a database. We have to come > up with some lowest common denominator. 
And having done that, we can > support *some* set of config formats or data sources, but a bunch of > formats will seem superfluous, as any added value they might provide > will be useless since it can't be relied upon. Since I compose WSGI components by sequentially wrapping one component within another, a sequence is the most natural way to pass WSGI configuration around to me. However, dictionaries are fine with me. I wouldn't necessarily enforce order in a config format per se. However, the point of breaking the composition process into distinct phases is so that I can use whatever config file format I wish and know that the WSGI component I'm configuring will receive the configuration data-lcr From ianb at colorstudy.com Sun Mar 4 02:07:55 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Sat, 03 Mar 2007 19:07:55 -0600 Subject: [Web-SIG] My summary of a web-platform Open-Space discussion at PyCon 2007 In-Reply-To: <45E99FE1.1090307@zetaweb.com> References: <52A740B9-536E-4946-B576-6BA818DF0730@zope.com> <45E99FE1.1090307@zetaweb.com> Message-ID: <45EA1BEB.6000104@colorstudy.com> Chad Whitacre wrote: > Jim, > > > I'll summarize my recollections of a very useful discussion > > that several of us had at PyCon 2007. > > Looks accurate to me, thanks. > > > > - Ian will lead a server benchmark effort > > Where by "server," we mean core HTTP server library, yes? Yes, cherrypy.wsgiserver, paste.httpserver, twisted.web2, flup, etc. At openplans we (well, Luke) did some performance testing, in our case of an intermediary we're writing. The same basic pattern should fit this. I wrote a couple WSGI apps for that that showed particular kinds of behavior. I guess all I really did was an application that periodically was really slow. Another interesting case would be an application that yielded content very slowly. Different combinations of app_iter and start_response writer could be interesting. 
And of course the simplest example (which is usually all people do) of a trivial application that just serves up a single short string. Oh, and I should do one that serves up a large string, in one chunk and many chunks. Personally my own interest is in servers that act well even when the apps act poorly, more than the single case of a fast server with a perfect and fast application behind it. The perfect app is easy to test, so it'll be in there too of course, but just one of many. I think most of the work will be in setting up httperf with some scripts to invoke it and the other server. The other stuff can already be glued together quite easily by Paste Deploy. Well, not counting some of the servers that are harder to put together, like Apache+flup/fastcgi (or another server there), or mod_python generally. I suppose I'll just write up some simple httpd.conf's for these cases, and I guess I can fire it off from a script easily enough. Well, I'll probably look to someone else to do mod_python (and mod_wsgi before long), since I'm bad at setting those up. Once Apache+flup is set up, Apache+mod_python would probably be easy for someone who knows their way around. -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From chad at zetaweb.com Sun Mar 4 05:27:29 2007 From: chad at zetaweb.com (Chad Whitacre) Date: Sat, 3 Mar 2007 23:27:29 -0500 Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more comments on Paste Deploy) Message-ID: > >> > 2. I'm not clear on how Paste Deploy's abstractions map to the > >> > filesystem. What does my website root look like? > >> > >> The way I have generally configured websites like this is like: > >> > >> [composite:main] > >> use = egg:Paste#urlmap > >> / = config:root.ini > >> /blog = config:superblog.ini > > > > Right, that's the configuration, but where is "egg:Paste#urlmap" on > > the filesystem? Are the three ini files alone in some directory? Where > > is paste? Where is SuperBlog?
Where is the rest of the site? I find it > > easier to start with the filesystem and then move up into > > object/config abstractions. > > You just have to understand what egg:Paste#urlmap is, probably from some > documentation. Admittedly that's boilerplate in the eyes of most people > who use it. It's there explicitly because Paste Deploy doesn't build > *any* WSGI anything into it, it only composes pieces, one of the most > common being urlmap. You can see docs for it with "paster points > paste.composite_factory urlmap", though I now notice I haven't written > any docs for it (bad of me), and that is hardly a simple command line. > I would certainly want to build a command-line help/browser (and > probably web one too) as part of a rewrite of the system. > > The three ini files do go in the same directory, though of course you > could do config:superblog/app.ini or something like that if you wanted > to set it up differently. It's a relative filename, relative to the > file where it is given. > > The applications themselves are eggs. You install them however you want > to install them (of course I'd strongly recommend workingenv, > virtual-python, or zc.buildout, but that's a separate concern). Some > people have mentioned some frustration about having to build full > libraries with a namespace, setup.py, eggs, etc. just to use > applications. But I think even pretty modest shops writing very one-off > apps gain a real benefit from these patterns, once you get over the > initial hump (and we can build tools to make the initial hump not so > bad, that's the point of paster create). Anyway, here's one reply I > made to that request: > http://pythonpaste.org/archives/message/20070215.192041.1534ce27.en.html > > There's a lot of practices around library management that *has* to be > done, because people use libraries. 
Most of this applies pretty well to > applications as well -- and since everyone *needs* to learn how to > manage their libraries, using the same mechanisms for managing > applications has some advantage. > > Incidentally, one change to the config format that would make it > possible to remove the explicit idea of "composite" apps, is to make > some key syntax that will instantiate the named object. E.g.,: > > app / = config:root.ini > > Then the keywords passed would just be {"/": }, instead > of the current {"/": "config:root.ini"} (where the "config:root.ini" is > passed to the loader object that the composite factory gets). Dude, my eyes are seriously glazing over. I want you to say something simple, like: $ cd /usr/local/www $ workingenv.py example.com ... $ cd example.com $ source bin/activate (example.com)$ mkdir etc Then stick a config file in etc/ and run a simple command to start your website. That's the kind of thing I imagine you doing (eh?), and it's also the thing that Aspen does. The difference is mostly in the config files. Now, Jim: it looks like Zope still uses a Unix-y userland for INSTANCE_HOME, yes? So that's Paste, Pylons(?), Aspen, Zope2 and Zope3 all using the same filesystem layout. IINM the filesystem structures of Django and CP/TurboGears are module-level (Bob?), so they could easily fit into lib/python. If we could agree on a really simple first-line config file that handles basic process configuration--address, user/group, threads, etc.--and then points to the next layer config--be it zope.conf, paste.ini, apps.conf, or settings.py--then we'd be pretty far towards a common app server. That is to say, I think we are really discussing three increasing levels of cooperation: 1) Server benchmarks and inter-op standards (Jim) 2) Common process management library (Bob) 3) Common web app server Without discouraging the first two efforts, I'd like to champion the third. 
Here would be my proposal: First, we define a "website" on the filesystem as a Unix-y userland with, at minimum, the following: etc/.conf lib/python Second, we adopt a simple ini-style format for .conf, which handles low-level process config. This file would then point to a second, framework-specific configuration layer. I suggest that this isn't too far from where we each are now, nor from where our discussion has already led. It fits long-established patterns (etc, ini), and doesn't preclude cooperation on benchmarks, inter-op, or libraries. Furthermore, collaborating here would spread around what amounts to grunt work, and give Python web deployment a simple and compelling story, while in no way crippling more advanced use cases. Are you guys interested in this proposal? If so, I can write it up in more detail. chad From fumanchu at amor.org Sun Mar 4 07:32:13 2007 From: fumanchu at amor.org (Robert Brewer) Date: Sat, 3 Mar 2007 22:32:13 -0800 Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: morecomments on Paste Deploy) References: Message-ID: <435DF58A933BA74397B42CDEB8145A86224D50@ex9.hostedexchange.local> Chad Whitacre wrote: > First, we define a "website" on the filesystem as a > Unix-y userland with, at minimum, the following: > > etc/.conf > lib/python > > Second, we adopt a simple ini-style format for .conf, > which handles low-level process config. This file would > then point to a second, framework-specific configuration > layer. I really don't see why we need a standard scaffolding (folder arrangement) just to read in a config file. Why can't the location of the site config file be passed as an argument to the invocation script? Keep in mind that some platforms will not allow deployers write access to any folders in which application code is kept... Robert Brewer System Architect Amor Ministries fumanchu at amor.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/web-sig/attachments/20070303/64df1495/attachment.htm From grahamd at dscpl.com.au Mon Mar 5 00:28:26 2007 From: grahamd at dscpl.com.au (Graham Dumpleton) Date: Sun, 4 Mar 2007 18:28:26 -0500 Subject: [Web-SIG] Chunked Tranfer encoding on request content. Message-ID: <1173050906.11628@dscpl.user.openhosting.com> The WSGI specification doesn't really say much about chunked transfer encoding for content sent within the body of a request. The only thing that appears to apply is the comment: WSGI servers must handle any supported inbound "hop-by-hop" headers on their own, such as by decoding any inbound Transfer-Encoding, including chunked encoding if applicable. What does this really mean in practice though? As a means of getting feedback on what is the correct approach I'll go through how the CherryPy WSGI server handles it. The problem is that the CherryPy approach raises a few issues which make me wonder if it is doing it in the most appropriate way. In CherryPy, when it sees that the Transfer-Encoding is set to 'chunked' while parsing the HTTP headers, it will at that point, even before it has called start_response for the WSGI application, read in all content from the body of the request. CherryPy reads in the content like this for two reasons. The first is so that it can then determine the overall length of the content that was available and set the CONTENT_LENGTH value in the WSGI environ. The second reason is so that it can read in any additional HTTP header fields that may occur in the trailer after the last data chunk and also incorporate them into the WSGI environ. The first issue with what it does is that it has read in all the content. This denies a WSGI application the ability to stream content from the body of a request and process it a bit at a time. If the content is huge, buffering it can also mean the application process size will grow significantly.
The second issue, although I am confused on whether the CherryPy WSGI server actually implements this correctly, is that if the client was expecting to see a 100 continue response, this will need to be sent back to the client before any content can be read. When chunked transfer encoding is not used, such a 100 continue response would in a good WSGI server only be sent when the WSGI application called read() on wsgi.input for the first time. I.e., the 100 continue indicates that the application which is consuming the data is actually ready to start processing it. What CherryPy WSGI server is doing is circumventing that and the client could think the final consumer application is ready before it actually is. Note that I am assuming here that 100 continue is still usable in conjunction with chunked transfer encoding. In CherryPy WSGI server it only actually sends the 100 continue after it attempts to try and read content in the presence of a chunked transfer encoding header. Not sure if this is actually a bug or not. CherryPy WSGI server also doesn't wait until first read() by WSGI application before sending back the 100 continue either and instead sends it as soon as the headers are parsed. This may be fine, but possibly not optimal as it denies an application the ability to fail a request and avoid a client sending the actual content. Now, to my mind, the preferred approach would be that the content would not be read up front like this and instead CONTENT_LENGTH would simply be unset in the WSGI environ. From prior discussions related to input filtering on the list, a WSGI application shouldn't really be paying much attention to CONTENT_LENGTH anyway and should just be using read() to get data until it returns an empty string. Thus, for chunked data, not knowing the content length up front shouldn't matter as it should just call read() until there is no more.
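A minimal sketch of that read-until-empty loop, written for a current Python. It assumes the server signals end of input by returning an empty byte string from read(), which WSGI itself does not guarantee for every server; the names read_request_body and app are hypothetical and used only for illustration.

```python
import io


def read_request_body(environ, block_size=8192):
    """Yield the request body block by block, ignoring CONTENT_LENGTH.

    Assumption: wsgi.input returns b"" once the body is exhausted.
    """
    stream = environ["wsgi.input"]
    while True:
        block = stream.read(block_size)
        if not block:
            break
        yield block


def app(environ, start_response):
    # Consume the body incrementally; only a running total is kept,
    # so a large upload is never held in memory all at once.
    total = sum(len(block) for block in read_request_body(environ))
    body = ("received %d bytes\n" % total).encode("ascii")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

With a server that behaves this way, the application never needs to look at CONTENT_LENGTH, so the same code would work for chunked and non-chunked requests.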
BTW, it may not be this simple for something like a proxy, but that is a discussion for another time. Doing this also means that the 100 continue only gets sent when the application is ready and there is no need for the content to be buffered up. That it is the actual application which is consuming the data and not some intermediary means that an application could implement some mechanism whereby it reads some data, acts on that and starts sending some data in response. The client then might send more data based on that response which the application only then reads, sends more data in response, etc. Thus an end to end communication stream can be established where the actual overall content length of the request could never be established up front. The only problem with deferring any reading of data to when the application wants to actually read it, is that if the overall length of content in the request is bounded, there is no way to get access to the additional headers in the trailer of the request and have them available in the WSGI environ since processing of the WSGI environ has already occurred before any data was read. So, what gives? What should a WSGI server do for chunked transfer encoding on a request? I may not totally understand 100 continue and chunked transfer encoding and am happy to be corrected in my understanding of them, but what CherryPy WSGI server does doesn't seem right to me at first look. Graham From sidnei at enfoldsystems.com Mon Mar 5 01:55:04 2007 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Sun, 4 Mar 2007 21:55:04 -0300 Subject: [Web-SIG] Chunked Tranfer encoding on request content.
In-Reply-To: <1173050906.11628@dscpl.user.openhosting.com> References: <1173050906.11628@dscpl.user.openhosting.com> Message-ID: I'm not quite aware of the 100 Continue semantics, but I know that applications which request Transfer-Encoding: chunked should *not* expect a Content-Length response header, nor should the WSGI thingie doing the 'chunking' need to know it in advance. 'chunked' is actually very simple. Simplifying it a lot, it basically needs to output '%x\r\n%s\r\n' % (len(chunk), chunk) for every chunk of data except the last which should be '0\r\n\r\n'. The only trick here is ensuring that no chunk of length '0' is written except the last. What might be happening is that CherryPy is outputting the whole response body as a single chunk, and relying on the 'Content-Length' header, which would be silly, I hope that's not what's happening though I haven't looked. -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 From grahamd at dscpl.com.au Mon Mar 5 02:33:38 2007 From: grahamd at dscpl.com.au (Graham Dumpleton) Date: Sun, 4 Mar 2007 20:33:38 -0500 Subject: [Web-SIG] Chunked Tranfer encoding on request content. Message-ID: <1173058418.9697@dscpl.user.openhosting.com> Sidnei da Silva wrote .. > I'm not quite aware of the 100 Continue semantics, but I know that > applications which request Transfer-Encoding: chunked should *not* > expect a Content-Length response header, nor should the WSGI thingie > doing the 'chunking' need to know it in advance. > > 'chunked' is actually very simple. Simplifying it a lot, it basically > needs to output '%x\r\n%s\r\n' % (len(chunk), chunk) for every chunk > of data except the last which should be '0\r\n\r\n'. The only trick > here is ensuring that no chunk of length '0' is written except the > last. 
> > What might be happening is that CherryPy is outputting the whole > response body as a single chunk, and relying on the 'Content-Length' > header, which would be silly, I hope that's not what's happening > though I haven't looked. I am not talking about the response body. I am talking about the body of the request. For example, the body of a POST request being sent from client to server. Graham From fumanchu at amor.org Mon Mar 5 03:02:25 2007 From: fumanchu at amor.org (Robert Brewer) Date: Sun, 4 Mar 2007 18:02:25 -0800 Subject: [Web-SIG] Chunked Tranfer encoding on request content. References: <1173050906.11628@dscpl.user.openhosting.com> Message-ID: <435DF58A933BA74397B42CDEB8145A86224D54@ex9.hostedexchange.local> Graham Dumpleton wrote: > In CherryPy, when it sees that the Transfer-Encoding > is set to 'chunked' while parsing the HTTP headers, > it will at that point, even before it has called > start_response for the WSGI application, read in all > content from the body of the request. > > CherryPy reads in the content like this for two reasons. > The first is so that it can then determine the overall > length of the content that was available and set the > CONTENT_LENGTH value in the WSGI environ. Right; IIRC the rfile just hangs if you try to read past Content-Length. Perhaps that can be fixed inside socket.makefile somewhere? > The second reason is so that it can read in any > additional HTTP header fields that may occur in > the trailer after the last data chunk and also > incorporate them into the WSGI environ. Yeah; I didn't see any other way to get Trailers into the environ. Perhaps that can be added to WSGI 2.0? I also just haven't had time to write a dechunker which worked on the fly. Patches welcome ;) > When chunked transfer encoding is not used, such a > 100 continue response would in a good WSGI server > only be sent when the WSGI application called read() > on wsgi.input for the first time. Sounds reasonable. 
Again, patches welcome ;) > Note that I am assuming here that 100 continue is > still usable in conjunction with chunked transfer > encoding. In CherryPy WSGI server it only actually > sends the 100 continue after it attempts to try > and read content in the presence of a chunked > transfer encoding header. Not sure if this is > actually a bug or not. It looks like a bug. The Expect header should be checked before decode_chunked (at least until the 100 response can be moved inside read()). Thanks for catching those! Robert Brewer System Architect Amor Ministries fumanchu at amor.org -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/web-sig/attachments/20070304/643b065e/attachment.html From sidnei at enfoldsystems.com Mon Mar 5 03:13:11 2007 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Sun, 4 Mar 2007 23:13:11 -0300 Subject: [Web-SIG] Chunked Tranfer encoding on request content. In-Reply-To: <1173058418.9697@dscpl.user.openhosting.com> References: <1173058418.9697@dscpl.user.openhosting.com> Message-ID: On 3/4/07, Graham Dumpleton wrote: > I am not talking about the response body. I am talking about the body of > the request. For example, the body of a POST request being sent from > client to server. Ah, ok. Anyway I don't see why it would need to read the whole body to do chunked. -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 From grahamd at dscpl.com.au Mon Mar 5 05:50:43 2007 From: grahamd at dscpl.com.au (Graham Dumpleton) Date: Sun, 4 Mar 2007 23:50:43 -0500 Subject: [Web-SIG] Chunked Tranfer encoding on request content. Message-ID: <1173070243.5536@dscpl.user.openhosting.com> Robert Brewer wrote .. 
> Graham Dumpleton wrote: > > In CherryPy, when it sees that the Transfer-Encoding > > is set to 'chunked' while parsing the HTTP headers, > > it will at that point, even before it has called > > start_response for the WSGI application, read in all > > content from the body of the request. > > > > CherryPy reads in the content like this for two reasons. > > The first is so that it can then determine the overall > > length of the content that was available and set the > > CONTENT_LENGTH value in the WSGI environ. > > Right; IIRC the rfile just hangs if you try to read > past Content-Length. Perhaps that can be fixed inside > socket.makefile somewhere? > > > The second reason is so that it can read in any > > additional HTTP header fields that may occur in > > the trailer after the last data chunk and also > > incorporate them into the WSGI environ. > > Yeah; I didn't see any other way to get Trailers into > the environ. Perhaps that can be added to WSGI 2.0? Don't know how you could cater for trailers in WSGI 2.0 without coming up with some totally new scheme of passing such additional information to the WSGI application. First idea I can think of at present is that if chunked transfer encoding that WSGI server sets 'wsgi.trailers' as an empty dictionary which it keeps a reference to and only populates when it actually encounters the trailers. Ie., only guaranteed to be set when read() finally returns an empty string. Any middleware would have to be obligated to pass the reference though and not actually copy the dictionary so that changes made later back at WSGI server layer would be available to application. Second idea I can think of is a new member function in 'wsgi.input' called 'trailers()' which could be used to access them. Alternatively, 'wsgi.trailers' could also be a function. Either way, it could return None when not yet known and dictionary when it is. 
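A rough sketch of the second idea, written for a current Python: an input wrapper that dechunks on the fly and exposes a trailers() method returning None until the final chunk has been seen. The class name ChunkedInput and the trailers() method are hypothetical; nothing like this exists in the WSGI specification. The sketch also assumes a buffered file object whose read(n) returns n bytes unless the connection ends, and it ignores chunk extensions.

```python
import io


class ChunkedInput:
    """Lazily decode a chunked request body (illustrative only)."""

    def __init__(self, rfile):
        self.rfile = rfile
        self._buffer = b""
        self._done = False
        self._trailers = None

    def _read_chunk(self):
        size_line = self.rfile.readline()
        # Anything after ';' is a chunk extension; this sketch drops it.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            self._read_trailers()
            self._done = True
            return
        self._buffer += self.rfile.read(size)
        self.rfile.readline()  # consume the CRLF that ends the chunk data

    def _read_trailers(self):
        trailers = {}
        while True:
            line = self.rfile.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.decode("latin-1").partition(":")
            trailers[name.strip()] = value.strip()
        self._trailers = trailers

    def read(self, size=-1):
        # Pull in chunks only until the request can be satisfied.
        while not self._done and (size < 0 or len(self._buffer) < size):
            self._read_chunk()
        if size < 0:
            data, self._buffer = self._buffer, b""
        else:
            data, self._buffer = self._buffer[:size], self._buffer[size:]
        return data

    def trailers(self):
        """Return None until the last chunk is read, then a dict."""
        return self._trailers
```

An application would call read() until it gets an empty string and only then ask for trailers(), which matches the "only guaranteed to be set when read() finally returns an empty string" behaviour described for the first idea.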
One problem with this is that in Apache, when the trailers are encountered, the lower level HTTP filter simply merges them on top of the existing input headers. You don't want to pass the full set of input headers again, so simply means the WSGI adapter for Apache would need to remember what headers it sent in environ to begin with and only put in trailers what had changed and thus were actually in the trailer. Anyway, it looks for the time being that if I am going to support streaming of chunked data that I state as a limitation that trailers aren't available as WSGI doesn't support a way of getting them. BTW, I looked around at the various packages trying to provide a WSGI server and I can't find one besides CherryPy WSGI server that even attempts to support chunked encoding on input. Makes it hard to use what other people did as a guide. :-( Graham From jim at zope.com Mon Mar 5 12:28:15 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 5 Mar 2007 06:28:15 -0500 Subject: [Web-SIG] daemon tools In-Reply-To: <435DF58A933BA74397B42CDEB8145A86224D4D@ex9.hostedexchange.local> References: <515038D2-29A5-498A-848E-8802C1963C91@zope.com> <435DF58A933BA74397B42CDEB8145A86224D4D@ex9.hostedexchange.local> Message-ID: <82E00AFB-0425-487C-A55B-1BD5DAE6E247@zope.com> On Mar 3, 2007, at 6:19 PM, Robert Brewer wrote: > Jim Fulton wrote: > > For some time, Zope has used a daemon-management tool > > we wrote called zdaemon: > > > > http://www.python.org/pypi/zdaemon > > > > Ironically, this sort of tool isn't Python specific at all, > > and the discussion highlighted some non-Python tools, notably > > daemontools and runit, neither of which seemed as appealing > > as zdaemon for various reasons. > > The user interface isn't Python-specific, but the interaction with > WSGI servers, middleware, applications, and frameworks should be. I don't think we are talking about the same thing. See my comment at the end of this note. 
> Components at all levels of the WSGI stack need to interact with > "site-wide" events and settings. What I'm envisioning (and writing > for CP at the moment) is a framework-neutral, one-per-site Engine > object that is basically a publish/subscribe messenger; when you > import a Python web framework, it registers listeners for process > start, stop, and graceful restart. These would be things that need > to happen regardless of the OS process invoker: whether a common > 'webctl' script (that we author), or a framework-specific function > (like cherrypy.quickstart), or Apache (via mod_python). I encourage you to look at the zope event system which already supports this use case: http://www.python.org/pypi/zope.event http://www.python.org/pypi/zope.component#handlers > The pub/sub model also supports plugins with their own channel(s). > For example, frameworks would blindly call engine.publish > ('autoreload.add', filename) as desired. If the invoker (webctl, > quickstart, or Apache) plugs in an autoreloader, great; it > subscribes to that channel, receives each message, and adds each > filename to its list of files to monitor. If no autoreloader has > been plugged in, the 'add' message is correctly ignored. And when > the autoreloader detects a change, it would also publish 'reload' > or 'reexec' messages, which would then be subscribed to by a Reexec > plugin. Most of the plugins would be provided by the invoker, but > frameworks would be free to use the Engine to register their own > events and event listeners. > > This interface between a site-wide container and the WSGI > components is far more important to me than the actual details of > invocation (like forking, signal-handling, logging, etc). The > latter can be written as Engine plugins, and can compete in a > market created by a good "Web Site Engine Interface" spec. I think you're "sitewide container" is the main program that loads the WSGI components. 
This might be Apache, if mod_python is used, or some Python script/program. I was discussing a tool that managed the main program in the latter case. Something that started and restarted it, provided status information, helped it to run as a proper daemon and so on. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Mon Mar 5 12:59:09 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 5 Mar 2007 06:59:09 -0500 Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more comments on Paste Deploy) In-Reply-To: References: Message-ID: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> On Mar 3, 2007, at 11:27 PM, Chad Whitacre wrote: ... > Now, Jim: it looks like Zope still uses a Unix-y userland for > INSTANCE_HOME, yes? Yes, but I hate it. At Zope Corporation, we're moving away from it for a number of reasons. For development, it adds structure that isn't needed. A Zope instance really only needs a few files. Trying to mimic some notional unix layout just adds pointless structure. The traditional complex Zope instance file layout led to the use of an instance "skeleton" to deal with all of the files, which led, in turn, to a copy and hack style of configuration customization that is inflexible and encourages cruft. For production deployments, we (Zope Corporation) install files into the *real* Unix tree where site administrators want them. We'll typically have a deployment that includes a number of applications. The deployment will create directories in /etc, /var/log, and /var/run, where the applications in the deployment put their configuration, log, and run-time files. They may also put files in places like /etc/init.d, and /etc/cron.d. The point being that this looks nothing like a traditional Zope instance installation.
Keeping the number of files used by an application minimal makes it easier to deal with the different needs of development and deployment and makes it easier, at least for me, to deal with different configurations. ... > 1) Server benchmarks and inter-op standards (Jim) Ian said he would lead this. 2) Common framework for WSGI application composition. > 2) Common process management library (Bob) > 3) Common web app server Not sure what this is. > > Without discouraging the first two efforts, I'd like to champion the > third. Here would be my proposal: > > First, we define a "website" on the filesystem as a Unix-y userland > with, at minimum, the following: > > etc/.conf > lib/python -1 for reasons I've already described. I'll note that I find lib/python especially silly. Why have a lib directory that contains a single subdirectory? We started this a long long time ago with Zope because that's how Python installed its own modules on Unix systems at the time. Since then, Python has switched to lib/pythonV.V. We don't mimic that for hysterical reasons. If someone really wanted to mimic how modules got installed into modern Unix Python installs, they'd use lib/pythonV.V/site-packages, which would be the height of absurdity. In practice, at least for us at Zope Corporation, our process instances don't have any Python modules. We have application definitions that contain the modules we use and multiple process instances of each application that contain only configuration data. > Second, we adopt a simple ini-style format for .conf, which > handles low-level process config. This file would then point to a > second, framework-specific configuration layer. We do something like this now. It doesn't require any particular file-system layout. The devil is in the details. Jim -- Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Mon Mar 5 13:05:27 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 5 Mar 2007 07:05:27 -0500 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <45E9E091.3070603@colorstudy.com> References: <45E8EB97.6090805@zetaweb.com> <45E9E091.3070603@colorstudy.com> Message-ID: On Mar 3, 2007, at 3:54 PM, Ian Bicking wrote: > Chad Whitacre wrote: >> All, >> >> Thanks, Jim and Ian, for bringing this discussion online. >> >> I have two hesitations with Paste Deploy: >> >> 1. The configuration syntax is really complex. I'm much more >> comfortable with multiple simpler config files. > > Is it really that complex? I don't think so, otoh, you make some good points. :) > There's a few too many ways to do middleware > around applications, I'm afraid. Yes > get/set is really a rather obscure > feature that I seldom use. I don't remember seeing this in the documentation. > The distinction between "composite" and > "app" isn't necessary, I think. Agreed > The ability to inherit from sections is really useful IMHO (though not > well described in documentation); that's where you do something like > "use = other_section", and then add settings that override that other > section's settings. Yes, but the way it is overloaded with selecting an entry point and referring to another configuration file is confusing. I Jim -- Jim Fulton mailto:jim at zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Mon Mar 5 13:06:22 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 5 Mar 2007 07:06:22 -0500 Subject: [Web-SIG] My summary of a web-platform Open-Space discussion at PyCon 2007 In-Reply-To: <45E99FE1.1090307@zetaweb.com> References: <52A740B9-536E-4946-B576-6BA818DF0730@zope.com> <45E99FE1.1090307@zetaweb.com> Message-ID: <1FF57434-BFD2-4892-B724-71D2D616250B@zope.com> On Mar 3, 2007, at 11:18 AM, Chad Whitacre wrote: > Jim, > > > I'll summarize my recollections of a very useful discussion > > that several of us had at PyCon 2007. > > Looks accurate to me, thanks. > > > > - Ian will lead a server benchmark effort > > Where by "server," we mean core HTTP server library, yes? Yes, WSGI server implementation. > > My impression is that there isn't a lot of appetite for > > standardizing on a common pain application. > > Sorry, "pain application?" :^) :) "main application". > I assume you mean a common app server executable, as opposed to > best practice docs, entry point standards, maybe even libraries, > etc. Yes? Yes. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From sidnei at enfoldsystems.com Mon Mar 5 15:16:30 2007 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Mon, 5 Mar 2007 11:16:30 -0300 Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more comments on Paste Deploy) In-Reply-To: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> Message-ID: On 3/5/07, Jim Fulton wrote: > For production deployments, we (Zope Corporation) install files into > the *real* Unix tree where site administrators want them. We'll > typically have a deployment that includes a number of applications.
> The deployment will create directories in /etc, /var/log, and /var/ > run, where the applications in the deployment put their > configuration, log, and run-time files. They may also put files in > places like /etc/init.d, and /etc/cron.d. The point being that this > looks nothing like a traditional Zope instance installation. How do you see that mapping to win32? There's no '/etc', '/etc/init.d' equivalent would be the current 'zopeservice.py', and '/etc/cron.d' equivalent would be 'scheduled tasks'. I believe '/var/log' could be replaced by logging to the 'nt event log', there are lots of tools to work with that. That still leaves '/etc/' and '/var/run' in the air. I guess they could just be right into the application directory? -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 From jim at zope.com Mon Mar 5 16:02:42 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 5 Mar 2007 10:02:42 -0500 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <45E99DC1.4010703@zetaweb.com> References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> Message-ID: <57C175B1-A485-4FEF-908C-7B849F576D5E@zope.com> On Mar 3, 2007, at 11:09 AM, Chad Whitacre wrote: ... > > 1. Can we agree on a standard set of entry points so that WSGI > > applications can be combined automatically? I think Paste > > Deploy provides at least good start on this. > > > > You haven't commented on the entry points defined by Paste > > Deploy. Do you have an opinion on adopting the entry-point API > > defined by Paste Deploy? > > Ok, I need help: defining an entry point allows a plugin to > advertise that it can satisfy that entry point, but you still need > a configuration layer to actually wire it up, no? Yes. > In which case: > > 1) What does "automatically" mean? It means that you don't have to write Python code to connect applications, servers, and middleware. 
> 2) Aren't we back to discussing config syntax? No. Entry points can be used by a variety of configuration syntaxes and by Python code. I should note that we can divide this discussion further, if we wish. Paste Deploy defines APIs and entry points for advertising objects that provide those APIs. The APIs are arguably the most essential thing to reuse from Paste Deploy. Entry points add *a* mechanism to make those objects a bit more discoverable. Arguably, specifying an application via: eggname#entrypointname doesn't provide much advantage over simply specifying the dotted path to an object in a module. If there were more tools for browsing for and working with eggs, then I think entry points would provide greater advantages as they would allow the tools to guide someone deciding how to reuse an egg by telling them about the components available. Personally, I think that use of entry points makes sense in a situation like this. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Mon Mar 5 18:14:55 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 5 Mar 2007 12:14:55 -0500 Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more comments on Paste Deploy) In-Reply-To: References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> Message-ID: <660BDDBF-79DA-43AF-8B22-AB7584230A63@zope.com> On Mar 5, 2007, at 9:16 AM, Sidnei da Silva wrote: > On 3/5/07, Jim Fulton wrote: >> For production deployments, we (Zope Corporation) install files into >> the *real* Unix tree where site administrators want them. We'll >> typically have a deployment that includes a number of applications. >> The deployment will create directories in /etc, /var/log, and /var/ >> run, where the applications in the deployment put their >> configuration, log, and run-time files. They may also put files in >> places like /etc/init.d, and /etc/cron.d. 
The point being that this >> looks nothing like a traditional Zope instance installation. > > How do you see that mapping to win32? There's no '/etc', '/etc/init.d' > equivalent would be the current 'zopeservice.py', and '/etc/cron.d' > equivalent would be 'scheduled tasks'. I believe '/var/log' could be > replaced by logging to the 'nt event log', there are lots of tools to > work with that. That still leaves '/etc/' and '/var/run' in the air. I > guess they could just be right into the application directory? We don't deploy to win32 and I don't know enough about win32 to answer. I expect though that, like Unix, a production deployment is going to look different than a development buildout. In any case, I'm pretty sure that the classic unix-mimicing layout has no advantages for win32. :) Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From sidnei at enfoldsystems.com Mon Mar 5 18:25:06 2007 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Mon, 5 Mar 2007 14:25:06 -0300 Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more comments on Paste Deploy) In-Reply-To: <660BDDBF-79DA-43AF-8B22-AB7584230A63@zope.com> References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> <660BDDBF-79DA-43AF-8B22-AB7584230A63@zope.com> Message-ID: On 3/5/07, Jim Fulton wrote: > We don't deploy to win32 and I don't know enough about win32 to > answer. I expect though that, like Unix, a production deployment is > going to look different than a development buildout. In any case, > I'm pretty sure that the classic unix-mimicing layout has no > advantages for win32. :) Well, it is something that needs to be considered though. We can't just close one eye and pretend that win32 does not exist. 
-- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 From jtate at rpath.com Mon Mar 5 18:54:56 2007 From: jtate at rpath.com (Joseph Tate) Date: Mon, 5 Mar 2007 12:54:56 -0500 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <45E9E091.3070603@colorstudy.com> References: <45E8EB97.6090805@zetaweb.com> <45E9E091.3070603@colorstudy.com> Message-ID: <200703051254.57032.jtate@rpath.com> On Saturday 03 March 2007 15:54:41 Ian Bicking wrote: > Chad Whitacre wrote: > > I suggest that a system with multiple simple config files is much > > more scalable than a single complex config file syntax. Imagine > > if all of Unix were configured using a single syntax! > > There's other cases where having both options is nice. Because Paste > Deploy doesn't fold config files together, you can also reuse them from > different contexts. (A more common way to use multiple config files -- > what ConfigParser.load supports -- is to just overlap all the sections, > usually totally clobbering each other. I like this more explicit way of > bringing in configuration, which treats configuration like a composable > set of configurations instead of a system where all the configuration > files are pretty tightly bound to each other.) I find that multiple files give you a nice way to override defaults, as long as the files are read in a way that's predictable and documentable, and ultimately appear as if read from a single file (and possibly displayable via some diagnostics link in an application). -- Joseph Tate Software Engineer rPath Inc.
http://www.rpath.com/rbuilder/ (919) 851-3984 x2106 From jtate at rpath.com Mon Mar 5 18:25:10 2007 From: jtate at rpath.com (Joseph Tate) Date: Mon, 5 Mar 2007 12:25:10 -0500 Subject: [Web-SIG] daemon tools In-Reply-To: <515038D2-29A5-498A-848E-8802C1963C91@zope.com> References: <515038D2-29A5-498A-848E-8802C1963C91@zope.com> Message-ID: <200703051225.10896.jtate@rpath.com> On Saturday 03 March 2007 11:08:24 Jim Fulton wrote: > > Anyway, I share this for your consideration. There are probably > better tools out there than zdaemon and supervisor2, but I'm not > aware of them. :) I'm curious what other people have found or use. ll.daemon (http://www.livinglogic.de/Python/daemon/index.html) seems to be a straightforward and very simple library for core daemon functionality. Daemontools isn't very well respected by the SysV style initscript crowd, and vice versa. That's an external non python dependency, and not commonly available. Certainly not available on Windows. I have written my own daemon base class (Pretty restrictive license [reciprocal], but I'm sure I could get it loosened). http://hg.rpath.com/raa-1.1?f=9ac380d082f4;file=raa/service/daemon.py I'm not married to it though, so would be happy to spin it out and remove the conary requirements, or just junk it. -- Joseph Tate Software Engineer rPath Inc. http://www.rpath.com/rbuilder/ (919) 851-3984 x2106 From chad at zetaweb.com Mon Mar 5 19:14:27 2007 From: chad at zetaweb.com (Chad Whitacre) Date: Mon, 05 Mar 2007 13:14:27 -0500 Subject: [Web-SIG] daemon tools In-Reply-To: <200703051225.10896.jtate@rpath.com> References: <515038D2-29A5-498A-848E-8802C1963C91@zope.com> <200703051225.10896.jtate@rpath.com> Message-ID: <45EC5E03.2070304@zetaweb.com> > ll.daemon (http://www.livinglogic.de/Python/daemon/index.html) > seems to be a straightforward and very simple library for core > daemon functionality. I'm using this in Aspen, and I like it. Worth checking out. 
chad From smulloni at smullyan.org Mon Mar 5 18:57:40 2007 From: smulloni at smullyan.org (Jacob Smullyan) Date: Mon, 5 Mar 2007 12:57:40 -0500 Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more comments on Paste Deploy) In-Reply-To: References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> <660BDDBF-79DA-43AF-8B22-AB7584230A63@zope.com> Message-ID: <20070305175740.GA7319@smullyan.org> On Mon, Mar 05, 2007 at 02:25:06PM -0300, Sidnei da Silva wrote: > Well, it is something that needs to be considered though. We can't > just close one eye and pretend that win32 does not exist. Yes, I prefer to close two eyes! -- Jacob Smullyan From jtate at rpath.com Mon Mar 5 19:27:23 2007 From: jtate at rpath.com (Joseph Tate) Date: Mon, 5 Mar 2007 13:27:23 -0500 Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more comments on Paste Deploy) In-Reply-To: References: Message-ID: <200703051327.23326.jtate@rpath.com> On Saturday 03 March 2007 23:27:29 Chad Whitacre wrote: > 3) Common web app server > > Without discouraging the first two efforts, I'd like to champion the > third. Here would be my proposal: > > First, we define a "website" on the filesystem as a Unix-y userland > with, at minimum, the following: > > etc/.conf > lib/python > > Are you guys interested in this proposal? If so, I can write it up in > more detail. No, and here's why. Most apps are deployed as eggs. This is a relatively high ante to pay for complicated setups, but boilerplate setup.py code solves the 80% case well enough. Using eggs means that the apps could be installed in different locations, site-packages, user's own pythonpath, anywhere. The config file or files are going to be what determines what gets loaded, and where, much more than os.getcwd(). The configuration can be determined via searching well known locations /etc/foo.cfg ~/.foo.cfg ./foo.cfg, etc. 
or passed in on the command line.[1]

References to apps will be to their eggs, which will be loaded from the
Python path. Installing eggs to arbitrary file system locations, while it
can be done, doesn't lend itself to super-packaging (rpm, dpkg,
InstallShield, etc.). It also requires more setup by the end
user/deployer than just running easy_install foo_app, or rpm -i
foo_app.rpm.

Also, user-land servers are not that interesting to me. They're great
for development, but production use is where I see the pain. I'm
interested in a common app server platform that focuses on running one
or more applications from an egg (which could, and perhaps should,
include its own configuration) mounted at different URL locations.

[1] This use of the current working directory for configuration file
loading could be used in the specialized Aspen case for an exploded egg
in an arbitrary file location.

--
Joseph Tate
Software Engineer
rPath Inc.
http://www.rpath.com/rbuilder/
(919) 851-3984 x2106

From fumanchu at amor.org Mon Mar 5 19:38:51 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Mon, 5 Mar 2007 10:38:51 -0800
Subject: [Web-SIG] daemon tools
In-Reply-To: <82E00AFB-0425-487C-A55B-1BD5DAE6E247@zope.com>
Message-ID: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local>

Jim Fulton wrote:
> For some time, Zope has used a daemon-management tool
> we wrote called zdaemon:
>
>    http://www.python.org/pypi/zdaemon
>
> Ironically, this sort of tool isn't Python specific at all,
> and the discussion highlighted some non-Python tools, notably
> daemontools and runit, neither of which seemed as appealing
> as zdaemon for various reasons.

and Robert Brewer replied:
> The user interface isn't Python-specific, but the interaction with
> WSGI servers, middleware, applications, and frameworks should be.

and Jim answered:
> I don't think we are talking about the same thing...
>
> I encourage you to look at the zope event system which already
> supports this use case:
>
>    http://www.python.org/pypi/zope.event

Yes, and Django has a similar mechanism which they call "signals":
http://code.djangoproject.com/wiki/Signals

What several people have asked for is the ability to combine
applications (and WSGI components) from a variety of frameworks into a
single "website". What I'm proposing is that we standardize on a set of
topics/channels/events/signals that are "site-wide" events, like start,
stop, restart and graceful. If we collaborated on a tool to manage
those, we could potentially make the codebases of each project smaller,
not just by removing the event manager, but by collaborating on a set
of standard event handlers, one of which could be a "daemonize me"
handler.

What we have now:

    CherryPy         Zope         Django
    --------        ------        -------
      ???           events        signals
       |               |             |
    autoreload        ???         autoreload
       |               |             |
     engine          zdrun          ???
       |               |             |
      ???            zdctl          ???

What we could have instead:

          webctl    modpython_gateway
             |       /
    ------------ pywebd ------------
        /          |          \
    --------     ------     ------
    CherryPy      Zope      Django

...where the "pywebd" module:

 1. Composes the WSGI stack (provides a library to do so at least),
 2. Notifies frameworks of site-wide events (like start, stop, restart
    and graceful),
 3. Provides plugins that frameworks can "notify"; for example, adding
    files to an autoreload plugin.

> I think your "sitewide container" is the main program that loads
> the WSGI components. This might be Apache, if mod_python is
> used, or some Python script/program.

Apache itself is not going to be the chunk of code that loads the WSGI
components. In my head, a modpython_gateway module (or something
similar) would ask pywebd to do that.

> I was discussing a tool that managed the main program in the
> latter case. Something that started and restarted it, provided
> status information, helped it to run as a proper daemon and so on.

Sure, something like zdctl?
But zdctl doesn't do the actual fork, zdrun does...so what does "help
run as a proper daemon" mean?

Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From sidnei at enfoldsystems.com Mon Mar 5 19:42:28 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Mon, 5 Mar 2007 15:42:28 -0300
Subject: [Web-SIG] The importance of deploying Python-based web apps on Windows (was: Re: [Proposal] "website" and first-level conf)
Message-ID:

On 3/5/07, Jacob Smullyan wrote:
> On Mon, Mar 05, 2007 at 02:25:06PM -0300, Sidnei da Silva wrote:
> > Well, it is something that needs to be considered though. We can't
> > just close one eye and pretend that win32 does not exist.
>
> Yes, I prefer to close two eyes!

I seriously hope you are kidding.

Unfortunately that's not possible. A lot of people, especially when
evaluating open-source projects, have their first contact with the
software through the Windows platform. To quote some numbers, the
Plone Installer for Windows has roughly 3x more downloads than the
second most downloaded package [1].

Now, I see clearly two options for open-source projects: have a
Windows story and increase your downloads by X%, where X can be a
number between 50-300 *wink*, or not have a Windows story and rely
on the *nix crowd to be the sole consumers of your software.

When you talk to a big organization that is already deploying their
applications on the Windows platform, what story do you want to tell
them? 'Oh, and by the way, all your investment in Windows software, you
will have to throw all that away if you want to use our software'. Good
luck with that.

I think that it's pretty important that Python-based web apps have as
good a story on Windows as Python has in other fields (pywin32 comes to
mind) but feel free to disagree.

Sorry for the rant.
[1] http://tinyurl.com/2dfx37 -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 From jim at zope.com Mon Mar 5 21:23:56 2007 From: jim at zope.com (Jim Fulton) Date: Mon, 5 Mar 2007 15:23:56 -0500 Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more comments on Paste Deploy) In-Reply-To: References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> <660BDDBF-79DA-43AF-8B22-AB7584230A63@zope.com> Message-ID: <33811262-2044-4B84-8921-9BC481564213@zope.com> On Mar 5, 2007, at 12:25 PM, Sidnei da Silva wrote: > On 3/5/07, Jim Fulton wrote: >> We don't deploy to win32 and I don't know enough about win32 to >> answer. I expect though that, like Unix, a production deployment is >> going to look different than a development buildout. In any case, >> I'm pretty sure that the classic unix-mimicing layout has no >> advantages for win32. :) > > Well, it is something that needs to be considered though. We can't > just close one eye and pretend that win32 does not exist. I wasn't suggesting we shouldn't consider it. I just don't think win32 will change my opinion of what I think about a unix-inspired instance layout. Someone should think about windows who actually uses it. I am not a windows server administrator, so I can't suggest how deploying applications on windows servers would effect file placement or layout. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From sidnei at enfoldsystems.com Mon Mar 5 21:48:35 2007 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Mon, 5 Mar 2007 17:48:35 -0300 Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more comments on Paste Deploy) In-Reply-To: <33811262-2044-4B84-8921-9BC481564213@zope.com> References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> <660BDDBF-79DA-43AF-8B22-AB7584230A63@zope.com> <33811262-2044-4B84-8921-9BC481564213@zope.com> Message-ID: On 3/5/07, Jim Fulton wrote: > > On 3/5/07, Jim Fulton wrote: > >> We don't deploy to win32 and I don't know enough about win32 to > >> answer. I expect though that, like Unix, a production deployment is > >> going to look different than a development buildout. In any case, > >> I'm pretty sure that the classic unix-mimicing layout has no > >> advantages for win32. :) > > > > Well, it is something that needs to be considered though. We can't > > just close one eye and pretend that win32 does not exist. > > I wasn't suggesting we shouldn't consider it. I just don't think > win32 will change my opinion of what I think about a unix-inspired > instance layout. > > Someone should think about windows who actually uses it. I am not a > windows server administrator, so I can't suggest how deploying > applications on windows servers would effect file placement or layout. Thanks for the clarification. So can I suggest that when the tools for deploying are created, that they be extensible so that someone can come in after the fact and put the win32-specific code in place without having to rewrite everything from scratch? Things that come to my mind are: - logging (should be able to swap file-based logging by nt event log logging for example). With ZConfig/zope.conf this an easy task. 
- 'cron'-like things, should be able to read settings from a file and install scheduled tasks that run the same scripts on Windows - 'service' code, should be able to have a generic service wrapper that can run anything as a service. - Application shouldn't rely on *nix signals, or should be made extensible to handle Windows 'named events', which are equivalent but not quite the same. -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 From ianb at colorstudy.com Mon Mar 5 22:19:14 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 05 Mar 2007 15:19:14 -0600 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <200703051254.57032.jtate@rpath.com> References: <45E8EB97.6090805@zetaweb.com> <45E9E091.3070603@colorstudy.com> <200703051254.57032.jtate@rpath.com> Message-ID: <45EC8952.1040703@colorstudy.com> Joseph Tate wrote: > On Saturday 03 March 2007 15:54:41 Ian Bicking wrote: >> Chad Whitacre wrote: >>> I suggest that a system with multiple simple config files is much >>> more scalable than a single complex config file syntax. Imagine >>> if all of Unix were configured using a single syntax! >> There's other cases where having both options is nice. Because Paste >> Deploy doesn't fold config files together, you can also reuse them from >> different contexts. (A more common way to use multiple config files -- >> what ConfigParser.load supports -- is to just overlap all the sections, >> usually totally clobbering each other. I like this more explicit way of >> bringing in configuration, which treats configuration like a composable >> set of configurations instead of a system where all the configuration >> files are pretty tightly bound to each other.) > > I find that multiple files gives you a nice way to override defaults. 
As long > as the files are read in a way that's predictable and documentable, and > ultimately appear as if read from a single file (and possible displayable via > some diagnostics link in an application). Allowing this sort of thing means that the application carries around a complete config object of some sort, which I rather dislike -- it allows for smart applications, but it makes it much harder to understand the configuration and any possible side effects. If we resolve the configuration down to something more limited (as the Paste Deploy entry points do) you can't really reconstruct the config from there. *Something* could still reconstruct the config (an alternate config loader, via logs, via debug settings, etc), just not the application itself. This is somewhat problematic for applications that have particularly complex config requirements, or want to support self-configuration. The best solution that I can think of with Paste Deploy in that case is to just use the Paste Deploy configuration to point to the "real" configuration. -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From ianb at colorstudy.com Mon Mar 5 22:23:46 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 05 Mar 2007 15:23:46 -0600 Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more comments on Paste Deploy) In-Reply-To: References: Message-ID: <45EC8A62.8060805@colorstudy.com> Chad Whitacre wrote: >> >> > 2. I'm not clear on how Paste Deploy's abstractions map to the >> >> > filesystem. What does my website root look like? >> >> >> >> The way I have generally configured websites like this is like: >> >> >> >> [composite:main] >> >> use = egg:Paste#urlmap >> >> / = config:root.ini >> >> /blog = config:superblog.ini >> > >> > Right, that's the configuration, but where is "egg:Paste#urlmap" on >> > the filesystem? Are the three ini files alone in some directory? Where >> > is paste? Where is SuperBlog? Where is the rest of the site? 
I find it >> > easier to start with the filesystem and then move up into >> > object/config abstractions. >> >> You just have to understand what egg:Paste#urlmap is, probably from some >> documentation. Admittedly that's boilerplate in the eyes of most people >> who use it. It's there explicitly because Paste Deploy doesn't build >> *any* WSGI anything into it, it only composes pieces, one of the most >> common being urlmap. You can see docs for it with "paster points >> paste.composite_factory urlmap", though I now notice I haven't written >> any docs for it (bad of me), and that is hardly a simple command line. >> I would certainly want to build a command-line help/browser (and >> probably web one too) as part of a rewrite of the system. >> >> The three ini files do go in the same directory, though of course you >> could do config:superblog/app.ini or something like that if you wanted >> to set it up differently. It's a relative filename, relative to the >> file where it is given. >> >> The applications themselves are eggs. You install them however you want >> to install them (of course I'd strongly recommend workingenv, >> virtual-python, or zc.buildout, but that's a separate concern). Some >> people have mentioned some frustration about having to build full >> libraries with a namespace, setup.py, eggs, etc. just to use >> applications. But I think even pretty modest shops writing very one-off >> apps gain a real benefit from these patterns, once you get over the >> initial hump (and we can build tools to make the initial hump not so >> bad, that's the point of paster create). Anyway, here's one reply I >> made to that request: >> http://pythonpaste.org/archives/message/20070215.192041.1534ce27.en.html >> >> There's a lot of practices around library management that *has* to be >> done, because people use libraries. 
Most of this applies pretty well to >> applications as well -- and since everyone *needs* to learn how to >> manage their libraries, using the same mechanisms for managing >> applications has some advantage. >> >> Incidentally, one change to the config format that would make it >> possible to remove the explicit idea of "composite" apps, is to make >> some key syntax that will instantiate the named object. E.g.,: >> >> app / = config:root.ini >> >> Then the keywords passed would just be {"/": }, instead >> of the current {"/": "config:root.ini"} (where the "config:root.ini" is >> passed to the loader object that the composite factory gets). > > Dude, my eyes are seriously glazing over. I want you to say something > simple, like: > > $ cd /usr/local/www > $ workingenv.py example.com > ... > $ cd example.com > $ source bin/activate > (example.com)$ mkdir etc > > Then stick a config file in etc/ and run a simple command to start > your website. But you are just hand-waving over the exact part that I am describing ("stick a config file in etc/"). What does that config file look like? How do you handle different cases with it? I cover a lot of pretty normal use cases up there. > That's the kind of thing I imagine you doing (eh?), and it's also the > thing that Aspen does. The difference is mostly in the config files. > > Now, Jim: it looks like Zope still uses a Unix-y userland for > INSTANCE_HOME, yes? So that's Paste, Pylons(?), Aspen, Zope2 and Zope3 > all using the same filesystem layout. IINM the filesystem structures > of Django and CP/TurboGears are module-level (Bob?), so they could > easily fit into lib/python. > > If we could agree on a really simple first-line config file that > handles basic process configuration--address, user/group, threads, > etc.--and then points to the next layer config--be it zope.conf, > paste.ini, apps.conf, or settings.py--then we'd be pretty far towards > a common app server. 
Part of why I push Paste Deploy is because every simpler or more abstract config idea could just as easily be composed as a Paste Deploy entry point. That is, one can create the abstract idea of a config loader, but that requires all the same boiler plate that a minimal Paste Deploy config file has anyway. Which is not to say someone might not want to write a different loader, but I don't think adding another layer of abstraction that's more neutral helps. > That is to say, I think we are really discussing three increasing > levels of cooperation: > > 1) Server benchmarks and inter-op standards (Jim) > 2) Common process management library (Bob) > 3) Common web app server > > Without discouraging the first two efforts, I'd like to champion the > third. Here would be my proposal: > > First, we define a "website" on the filesystem as a Unix-y userland > with, at minimum, the following: > > etc/.conf > lib/python > > Second, we adopt a simple ini-style format for .conf, which > handles low-level process config. This file would then point to a > second, framework-specific configuration layer. If it's framework-specific, how do you determine what the framework is? You need some kind of slug to do that, or else a separate runner. That also doesn't really do anything for composing multiple different applications that happen to use different frameworks. Personally I find framework-specific configuration rather dumb, because the point of all this isn't to build *frameworks*, it's to build *applications*, and frameworks are just an implementation details of an application. One could say that it would be better if the application shipped its own setup, meaning its own appctl script. This doesn't allow very well for wrapping or composing applications, but it's a valid thing to provide. But I don't think your proposal goes in that direction. 
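
The path-based composition discussed above (a site where "/" and "/blog"
are served by separately configured applications) can be pictured with a
small sketch. This is not Paste's urlmap implementation, only a minimal
stand-in written against plain WSGI. The names make_urlmap, make_app and
call are hypothetical and exist only for this illustration.

```python
def make_urlmap(mapping):
    """Return a WSGI app that dispatches on the longest matching path prefix.

    Hypothetical helper for illustration; not the paste.urlmap API.
    """
    # Normalise "/" to "" and try longer prefixes first, so "/blog" wins over "/".
    routes = sorted(
        ((prefix.rstrip("/"), app) for prefix, app in mapping.items()),
        key=lambda item: len(item[0]),
        reverse=True,
    )

    def urlmap_app(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in routes:
            if path == prefix or path.startswith(prefix + "/"):
                # Move the matched prefix from PATH_INFO to SCRIPT_NAME so the
                # mounted application does not need to know where it lives.
                environ = dict(environ)
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                environ["PATH_INFO"] = path[len(prefix):]
                return app(environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found"]

    return urlmap_app


def make_app(name):
    """Build a trivial WSGI app that reports how it was mounted."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        body = "%s saw SCRIPT_NAME=%r PATH_INFO=%r" % (
            name, environ["SCRIPT_NAME"], environ["PATH_INFO"])
        return [body.encode("utf-8")]
    return app


def call(app, path):
    """Invoke a WSGI app with a bare-bones environ; return (status, body)."""
    statuses = []
    environ = {"SCRIPT_NAME": "", "PATH_INFO": path}
    chunks = app(environ, lambda status, headers: statuses.append(status))
    return statuses[0], b"".join(chunks).decode("utf-8")


# Stand-ins for the applications that root.ini and superblog.ini would build.
site = make_urlmap({"/": make_app("root"), "/blog": make_app("superblog")})
```

Calling call(site, "/blog/2007/03") reaches the "superblog" stand-in with
the "/blog" prefix moved into SCRIPT_NAME, while any other path falls
through to the "root" stand-in. The composing layer needs to know nothing
about which framework produced either application, which is the property
argued for above.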
-- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From pje at telecommunity.com Mon Mar 5 22:38:51 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 05 Mar 2007 16:38:51 -0500 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <57C175B1-A485-4FEF-908C-7B849F576D5E@zope.com> References: <45E99DC1.4010703@zetaweb.com> <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> Message-ID: <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> At 10:02 AM 3/5/2007 -0500, Jim Fulton wrote: >Entry points add *a* mechanism to make those objects a bit more >discoverable. Arguably, specifying an application via: >eggname#entrypointname doesn't provide much advantage over simply >specifying the dotted path to an object in a module. Actually, it provides one very important strategic advantage that I don't think has been mentioned in this conversation. A configuration format that can specify project/version information can be used as a single-file deployment spec for an easy_install wrapper or buildout-like tool. The advantage of this for virtual hosting providers in particular is significant -- if they support the tool, they can support this one-file deployment scheme. Personally, I don't care for the Paste Deploy syntax -- frankly it's almost barbaric. :) But the concept of being able to specify stacks, routes, and configuration in a plain text format that includes package information for automated deployment is nonetheless an important one. A couple years back, I started writing a library to parse a more sophisticated, Python-like syntax to do the same sorts of things, but only got as far as the parser. One discussion was here: http://mail.python.org/pipermail/web-sig/2005-August/001714.html The basic idea behind the syntax was that assignments are like keyword arguments, and non-assignment statements are positional arguments. 
I'm not altogether happy with that syntax either, however, as it has a little too much "more than one way to do it", which is one reason I never finished the implementation. There is a library that parses it (and does other general-purpose Python-like DSL parsing) at: ViewSVN: http://svn.eby-sarna.com/SCALE/ Checkout: svn://svn.eby-sarna.com/svnroot/SCALE/ Docs: http://peak.telecommunity.com/DevCenter/scale.dsl#parsing-declarations Anyway, all that aside, I think it would be fantastic if we could come up with some "universal file format" for single-file configuration and deployment of applications (including auto-install of all needed eggs), that could get stdlib support and ultimately hosting company support. This would actually give us a leg up on even PHP for ease-of-deployment. In truth, it doesn't matter if the file *contents* are standardized. Standardization could be as simple as defining a #! line like: #!/usr/bin/pydeploy2.3 SomeFormatEgg==1.1 Where "SomeFormatEgg" offers a "python.deploy" entry point for running the file, and the pydeploy tool obtains the necessary egg and provides libraries for the parsing tool to auto-locate and install any eggs needed by the body. This could also be a basis for bootstrapping other systems, including perhaps buildouts (e.g. "#!/usr/bin/pydeploy2.4 zc.buildout" at the top of a buildout .ini)! So, while a single content format would be nice, we don't even need that in order to get a raw deployment system standard. Perhaps I should build this hypothetical pydeploy tool into setuptools 0.7? From pje at telecommunity.com Mon Mar 5 22:39:23 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Mon, 05 Mar 2007 16:39:23 -0500 Subject: [Web-SIG] wsgiref and wsgi.multithread/wsgi.multiprocess In-Reply-To: <20070209175649.GA21915@caltech.edu> References: <5.1.1.6.0.20070209120902.038b7e20@sparrow.telecommunity.com> <20070209075401.GA9697@caltech.edu> <5.1.1.6.0.20070209120902.038b7e20@sparrow.telecommunity.com> Message-ID: <5.1.1.6.0.20070305163900.02a48088@sparrow.telecommunity.com> At 09:56 AM 2/9/2007 -0800, Titus Brown wrote: >On Fri, Feb 09, 2007 at 12:10:00PM -0500, Phillip J. Eby wrote: >-> Yeah, multiprocess should probably be set false there, and >-> multithreadedness should depend on whether the ThreadingTCPServer or >-> whatever it's called is mixed in. (HTTPServer does in fact support this, >-> but it's not tested in a WSGI context as far as I know.) > >OK. Err, do you want a patch? ;) Not really, but I'll take one anyway. :) From jtate at rpath.com Mon Mar 5 23:23:09 2007 From: jtate at rpath.com (Joseph Tate) Date: Mon, 5 Mar 2007 17:23:09 -0500 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <45EC8952.1040703@colorstudy.com> References: <45E8EB97.6090805@zetaweb.com> <200703051254.57032.jtate@rpath.com> <45EC8952.1040703@colorstudy.com> Message-ID: <200703051723.09795.jtate@rpath.com> On Monday 05 March 2007 16:19:14 Ian Bicking wrote: > Joseph Tate wrote: > > I find that multiple files gives you a nice way to override defaults. As > > long as the files are read in a way that's predictable and documentable, > > and ultimately appear as if read from a single file (and possible > > displayable via some diagnostics link in an application). > > Allowing this sort of thing means that the application carries around a > complete config object of some sort, which I rather dislike -- it allows > for smart applications, but it makes it much harder to understand the > configuration and any possible side effects. 
If we resolve the > configuration down to something more limited (as the Paste Deploy entry > points do) you can't really reconstruct the config from there. > *Something* could still reconstruct the config (an alternate config > loader, via logs, via debug settings, etc), just not the application > itself. > > This is somewhat problematic for applications that have particularly > complex config requirements, or want to support self-configuration. The > best solution that I can think of with Paste Deploy in that case is to > just use the Paste Deploy configuration to point to the "real" > configuration. I agree. That's why my app has a /config link that spits out the "effective" configuration. The overridden config is a hard requirement, I'd love to hear alternative solutions. /etc/php.d, /etc/httpd/conf.d and that ilk come to mind as examples of this kind of thing. -- Joseph Tate Software Engineer rPath Inc. http://www.rpath.com/rbuilder/ (919) 851-3984 x2106 From jtate at rpath.com Tue Mar 6 03:46:46 2007 From: jtate at rpath.com (Joseph Tate) Date: Mon, 5 Mar 2007 21:46:46 -0500 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> References: <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> Message-ID: <200703052146.46699.jtate@rpath.com> On Monday 05 March 2007 16:38:51 Phillip J. Eby wrote: > At 10:02 AM 3/5/2007 -0500, Jim Fulton wrote: > >Entry points add *a* mechanism to make those objects a bit more > >discoverable. Arguably, specifying an application via: > >eggname#entrypointname doesn't provide much advantage over simply > >specifying the dotted path to an object in a module. > > Actually, it provides one very important strategic advantage that I don't > think has been mentioned in this conversation. 
A configuration format that > can specify project/version information can be used as a single-file > deployment spec for an easy_install wrapper or buildout-like tool. > > The advantage of this for virtual hosting providers in particular is > significant -- if they support the tool, they can support this one-file > deployment scheme. > > Personally, I don't care for the Paste Deploy syntax -- frankly it's almost > barbaric. :) But the concept of being able to specify stacks, routes, and > configuration in a plain text format that includes package information for > automated deployment is nonetheless an important one. > > A couple years back, I started writing a library to parse a more > sophisticated, Python-like syntax to do the same sorts of things, but only > got as far as the parser. > > One discussion was here: > > http://mail.python.org/pipermail/web-sig/2005-August/001714.html > > The basic idea behind the syntax was that assignments are like keyword > arguments, and non-assignment statements are positional arguments. > > I'm not altogether happy with that syntax either, however, as it has a > little too much "more than one way to do it", which is one reason I never > finished the implementation. There is a library that parses it (and does > other general-purpose Python-like DSL parsing) at: > > ViewSVN: http://svn.eby-sarna.com/SCALE/ > Checkout: svn://svn.eby-sarna.com/svnroot/SCALE/ > Docs: > http://peak.telecommunity.com/DevCenter/scale.dsl#parsing-declarations > > Anyway, all that aside, I think it would be fantastic if we could come up > with some "universal file format" for single-file configuration and > deployment of applications (including auto-install of all needed eggs), > that could get stdlib support and ultimately hosting company support. This > would actually give us a leg up on even PHP for ease-of-deployment. Doesn't setuptools already give this? 
easy_install foo.app.egg will install all of the needed eggs if the
dependencies are properly listed.

> So, while a single content format would be nice, we don't even need that in
> order to get a raw deployment system standard. Perhaps I should build this
> hypothetical pydeploy tool into setuptools 0.7?

I don't see there being a lot of demand for this. The use case I'm
considering is the end user developer or administrator deploying one or
more delivered Python web applications to a production environment
(self-hosted, colo-hosted, or leased server). I think that except for
where you have multiple servers behind a load balancer or something,
this is a one-time operation (barring failure cases, etc.).
Administrators already script this kind of thing using shell.

Also, in any "enterprise" environment that I'm familiar with, a
mechanism that automatically downloads and installs software wouldn't
fly. Administrators want to know everything that goes on a system, and
want the software managed through their patch/package management system.
Philosophical discussions on whether that's good or not seem to be
irrelevant.

Those using $4.95 hosting plans are only setting up one server, and will
need something custom to their installation anyway, so "pydeploy" won't
help them either. They'll be trying to install trac, some blogging
software and then an arbitrary image gallery, et al., but won't have
the same selections as another $4.95 hosting customer. This is the key
problem we're trying to solve. I consider the packaging and delivery
problem solved[1], or at least out of the scope of this problem.

--
Joseph Tate
Software Engineer
rPath Inc.
http://www.rpath.com/rbuilder/
(919) 851-3984 x2106

[1] Good enough for most things, but better support for stuff outside
the egg is needed: config files (so that the user can tweak them),
locale data (or maybe a pkg_resources wrapper for gettext that loads
that data from the egg).
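
The footnote's point about config files the user can tweak, together
with the "effective configuration" idea raised earlier in the thread,
can be sketched with the standard library alone. This is only an
illustration under stated assumptions: load_effective_config and the
file names defaults.cfg, site.cfg and user.cfg are hypothetical, and
Python 3's configparser is used rather than the ConfigParser module of
the period.

```python
import configparser
import os
import tempfile


def load_effective_config(paths):
    """Read the files in order; values in later files override earlier ones.

    Returns (effective, used): the merged settings as plain dicts, and the
    list of files that were actually found and read.
    """
    parser = configparser.ConfigParser()
    used = parser.read(paths)  # files that do not exist are silently skipped
    effective = {s: dict(parser.items(s)) for s in parser.sections()}
    return effective, used


with tempfile.TemporaryDirectory() as tmp:
    shipped = os.path.join(tmp, "defaults.cfg")  # stands in for a file shipped with the app
    site = os.path.join(tmp, "site.cfg")         # stands in for an administrator's override
    missing = os.path.join(tmp, "user.cfg")      # deliberately never created

    with open(shipped, "w") as f:
        f.write("[server]\nport = 8080\ndebug = false\n")
    with open(site, "w") as f:
        f.write("[server]\nport = 80\n")

    effective, used = load_effective_config([shipped, site, missing])
```

Here effective["server"] ends up with the overridden port and the
shipped debug value, and used records which files contributed. A
diagnostics page could print those two values to show what the
application is really running with.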
From pje at telecommunity.com Tue Mar 6 04:25:27 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 05 Mar 2007 22:25:27 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <200703052146.46699.jtate@rpath.com>
References: <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070305222325.02812190@sparrow.telecommunity.com>

At 09:46 PM 3/5/2007 -0500, Joseph Tate wrote:
>Those using $4.95 hosting plans are only setting up one server, and will need
>something custom to their installation anyway, so "pydeploy" won't help them
>either. They'll be trying to install trac, some blogging software and then
>an arbitrary image gallery, et al., but won't have the same selections as
>another $4.95 hosting customer. This is the key problem we're trying to
>solve.

I was saying that they would drop in a single file for trac, a single file for a blog, one for an image gallery, etc. That's a heck of a big deployment advantage, actually.

I wasn't talking about configuring a "server" -- I was talking about deploying *applications*.

From chris at simplistix.co.uk Tue Mar 6 20:59:54 2007
From: chris at simplistix.co.uk (Chris Withers)
Date: Tue, 06 Mar 2007 19:59:54 +0000
Subject: [Web-SIG] The importance of deploying Python-based web apps on Windows (was: Re: [Proposal] "website" and first-level conf)
In-Reply-To:
References:
Message-ID: <45EDC83A.1050806@simplistix.co.uk>

Sidnei da Silva wrote:
> I seriously hope you are kidding.
>
> Unfortunately that's not possible. A lot of people, especially when
> evaluating open-source projects, have their first contact with the
> software through the Windows platform. To quote some numbers, the
> Plone Installer for Windows has roughly 3x more downloads than
> the second most downloaded package [1].
>
> Now, I see clearly two options for open-source projects: have a
> Windows story and increase your downloads by X%, where X can be a
> number between 50-300 *wink*, or not have a Windows story and rely
> on the *nix crowd to be the sole consumers of your software.
>
> When you talk to a big organization that is already deploying their
> applications on the Windows platform what story do you want to tell them?
> 'Oh, and by the way, all your investment in Windows software, you will
> have to throw all that away if you want to use our software'. Good
> luck with that.
>
> I think that it's pretty important that Python-based web apps have as
> good of a story on Windows as it has in other fields (pywin32 comes to
> mind) but feel free to disagree.
>
> Sorry for the rant.

No, and this really deserves saying again...

Windows isn't going to vanish any time soon, and we're not going to help it vanish any quicker by head-in-sand'ing its existence...

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
- http://www.simplistix.co.uk

From chris at simplistix.co.uk Tue Mar 6 20:56:34 2007
From: chris at simplistix.co.uk (Chris Withers)
Date: Tue, 06 Mar 2007 19:56:34 +0000
Subject: [Web-SIG] [Proposal] "website" and first-level conf
In-Reply-To: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
Message-ID: <45EDC772.3090803@simplistix.co.uk>

Jim Fulton wrote:
> On Mar 3, 2007, at 11:27 PM, Chad Whitacre wrote:
> ...
>> Now, Jim: it looks like Zope still uses a Unix-y userland for
>> INSTANCE_HOME, yes?
>
> Yes, but I hate it. At Zope Corporation, we're moving away from it
> for a number of reasons.

I actually like it a lot, still, and I haven't heard compelling arguments, for me, for other things...

The big plus point for me is that everything needed for one deployment is in one folder.
I agree with Jim that in large-scale deployments, as ZC does, there may not be the need to worry about this, but I think Python is probably in use in a lot more projects where there's more than one project per machine, and you want to be able to totally isolate them from each other.

INSTANCE_HOME in Zope 2 felt like the right balance for me...

> For development, it adds structure that isn't needed. A Zope
> instance really only needs a few files. Trying to mimic some
> notional unix layout just adds pointless structure.

It's kind of self documenting though:

/etc -> config
/bin -> scripts
/var -> data
/log -> logs

I like that consistency, regardless of its origins...

> The traditional complex Zope instance file layout led to the use of
> an instance "skeleton" to deal with all of the files, which led, in
> turn, to a copy and hack style of configuration customization that is
> inflexible and encourages cruft.

I think the Zope 3 skeletons went the wrong way. The skeletons work, but only where they contain just the config that's specific to that instance. Zope 3's notions of putting python scripts (and non-trivial ones at that!) and the like into the instance home made me shudder...

> For production deployments, we (Zope Corporation) install files into
> the *real* Unix tree where site administrators want them.

Not everyone runs on unix. Having a standard layout that fits into a folder works cross platform to a large extent.

> Keeping the number of files used by an application minimal makes it
> easier to deal with the different needs of development and deployment
> and makes it easier, at least for me, to deal with different
> configurations.

Yep.

> I'll note that I find lib/python especially silly.

Agreed.
lib would be fine, mind you, so would Products ;-)

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
- http://www.simplistix.co.uk

From ianb at colorstudy.com Wed Mar 7 03:08:46 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 06 Mar 2007 20:08:46 -0600
Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking Middleware
Message-ID: <45EE1EAE.50705@colorstudy.com>

Posted here: http://wsgi.org/wsgi/Specifications/avoiding_serialization

Text copied below for discussion:

:Title: Avoiding Serialization When Stacking Middleware
:Author: Ian Bicking
:Discussions-To: Python Web-SIG
:Status: Proposed
:Created: 06-03-2007

.. contents::

Abstract
--------

This proposal gives a strategy for avoiding unnecessary serialization and deserialization of request and response bodies. It does so by attaching attributes to ``wsgi.input`` and the ``app_iter``, as well as a new environment key ``x-wsgiorg.want_parsed_response``.

Rationale
---------

Output-transforming middleware often has to parse the upstream content, transform it, then serialize it back to a string for output. The original output may have already been in the parsed form that the middleware wanted. Or there may be more middleware that does similar transformations on the same kind of objects.

The same things apply to the parsing of ``wsgi.input``, specifically parsing form data. A similar strategy is presented to avoid unnecessarily reparsing that data.

Specification
-------------

WSGI applications (or middleware) can return an app_iter that not only serializes the output, but also has extra attributes. An attribute is given here, ``app_iter.x_wsgiorg_parsed_response`` which is a function/method that takes one argument, the "type" of object that you want to receive. It may return that type of object, or None (meaning it cannot produce that type of object). Consumers should fall back on normal parsing of the response if the method does not exist, or returns None.
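The consumer side of the contract just described (ask the producer for a parsed object, fall back to normal parsing if the method is missing or returns None) might be sketched as follows. This is only an illustration under assumptions: ``get_parsed`` and ``WordList`` are hypothetical names that appear nowhere in the proposal, and a plain list of words stands in for a real parsed document.

```python
def get_parsed(body, wanted_type, fallback_parse):
    """Return ``body`` as ``wanted_type``, asking the producer first."""
    method = getattr(body, 'x_wsgiorg_parsed_response', None)
    if method is not None:
        parsed = method(wanted_type)
        if parsed is not None:
            return parsed
    # Method missing, or it declined: serialize and parse the normal way.
    return fallback_parse(b''.join(body))


class WordList(object):
    """Hypothetical app_iter whose parsed form is a plain list of words."""

    def __init__(self, words):
        self.words = words

    def __iter__(self):
        # Normal WSGI behaviour: an iterable of byte strings.
        yield ' '.join(self.words).encode('ascii')

    def x_wsgiorg_parsed_response(self, type):
        # Only the one supported type is handed over unserialized.
        if type is list:
            return self.words
        return None
```

A caller would pass its own parser as ``fallback_parse``, so an ordinary list of byte strings and a ``WordList`` are handled by the same code path.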
Similarly the ``environ['wsgi.input']`` object may have the same method, with the same meaning.

WSGI applications that want to lazily serialize their output have a problem: they probably cannot calculate ``Content-Length`` without doing the actual serialization. Browsers typically want to know about ``Content-Length``, but WSGI middleware seldom cares, since it can just get the content from app_iter regardless of its length. WSGI middleware that will transform the output can set ``environ['x-wsgiorg.want_parsed_response'] = True`` to give this hint to the application. Applications are thus encouraged to only lazily serialize their output when that key is present and true. (There is no equivalent concept for ``wsgi.input``.)

The object returned by ``.x_wsgiorg_parsed_response()`` may be modified in-place by the WSGI middleware using that object. Producers should make a copy if they do not want consumers modifying the object.

Example
--------

Two examples are provided: one for output, and one for input.

The output transformation parses the page with ``lxml.etree.HTML`` (from the lxml library) and replaces all ``<i>`` tags with ``<em>`` tags.
First we show the middleware::

    import lxml.etree

    class EmTagMiddleware(object):

        def __init__(self, app):
            self.app = app

        def __call__(self, environ, start_response):
            parent_wants_parsed = environ.get('x-wsgiorg.want_parsed_response')
            environ['x-wsgiorg.want_parsed_response'] = True
            written_output = []
            captured_headers = []
            def repl_start_response(status, headers, exc_info=None):
                if exc_info:
                    raise exc_info[0], exc_info[1], exc_info[2]
                captured_headers[:] = [status, headers]
                return written_output.append
            app_iter = self.app(environ, repl_start_response)
            parsed = None
            if captured_headers and not written_output:
                method = getattr(app_iter, 'x_wsgiorg_parsed_response', None)
                if method:
                    parsed = method(lxml.etree._Element)
            if parsed is None:
                # Have to manually parse, because:
                # a) start_response was called lazily
                # b) the start_response writer was used
                # c) app_iter.x_wsgiorg_parsed_response didn't exist
                # d) that method returned None
                try:
                    for item in app_iter:
                        written_output.append(item)
                finally:
                    if hasattr(app_iter, 'close'):
                        app_iter.close()
                parsed = self.parse_body(''.join(written_output))
            status, headers = captured_headers
            new_body = self.transform_body(parsed)
            for i in range(len(headers)):
                if headers[i][0].lower() == 'content-length':
                    del headers[i]
                    break
            if parent_wants_parsed:
                new_app_iter = self.make_app_iter(new_body)
            else:
                serialized_body = serialize(new_body)
                headers.append(('Content-Length', str(len(serialized_body))))
                new_app_iter = [serialized_body]
            # Pass the captured (and adjusted) status and headers on to the
            # real start_response; the response is never started otherwise.
            start_response(status, headers)
            return new_app_iter

        def parse_body(self, body):
            return lxml.etree.HTML(body)

        def transform_body(self, root):
            for el in root.xpath('//i'):
                el.tag = 'em'
            return root

        def make_app_iter(self, body):
            return LazyLXML(body)

    def serialize(element):
        return lxml.etree.tostring(element)

    class LazyLXML(object):

        def __init__(self, body):
            self.body = body
            self.have_yielded = False

        def __iter__(self):
            return self

        def next(self):
            if self.have_yielded:
                raise StopIteration
            self.have_yielded = True
            return serialize(self.body)

        def x_wsgiorg_parsed_response(self, type):
            if type is lxml.etree._Element:
                return self.body
            return None

Here's a simpler example for parsing normal form inputs in ``wsgi.input``::

    import cgi
    import urllib
    from cStringIO import StringIO

    def parse_form(environ):
        content_type = environ.get('CONTENT_TYPE', '')
        assert content_type in ['application/x-www-form-urlencoded',
                                'multipart/form-data']
        wsgi_input = environ['wsgi.input']
        method = getattr(wsgi_input, 'x_wsgiorg_parsed_response', None)
        if method:
            parsed = method(cgi.FieldStorage)
            if parsed is not None:
                return parsed
        form = cgi.FieldStorage(fp=wsgi_input, environ=environ,
                                keep_blank_values=True)
        environ['wsgi.input'] = FakeFormInput(form)
        return form

    class FakeFormInput(object):

        def __init__(self, form):
            self.form = form
            self.serialized = None

        def x_wsgiorg_parsed_response(self, type):
            if type is cgi.FieldStorage:
                return self.form
            return None

        def read(self):
            if self.serialized is None:
                self._serialize()
            return self.serialized.read()

        def readline(self, *args):
            if self.serialized is None:
                self._serialize()
            return self.serialized.readline(*args)

        def readlines(self, *args):
            if self.serialized is None:
                self._serialize()
            return self.serialized.readlines(*args)

        def __iter__(self):
            if self.serialized is None:
                self._serialize()
            return iter(self.serialized)

        def _serialize(self):
            # XXX: Doesn't deal with file uploads, and multipart/form-data generally
            data = urllib.urlencode(self.form.list, True)
            self.serialized = StringIO(data)

Problems
--------

Obviously the code is not simple, but this is the nature of WSGI output-transforming middleware. Ideally a framework of some sort would be used to construct this kind of middleware.

Something that replaces ``wsgi.input`` (like the example) may change the ``CONTENT_LENGTH`` of the request; normalization alone may change the length, even if the data is the same (e.g., there are multiple ways to urlencode a string).
However, there's no way without actually serializing to determine the proper length. Ideally requests like this should allow simply reading to the end of the object, without needing a ``CONTENT_LENGTH`` restriction (this is not true for socket objects). Ideally something like ``CONTENT_LENGTH="-1"`` would indicate this situation (simply a missing ``CONTENT_LENGTH`` generally means ``0``).

Another option is to set it to 1 and simply return the entire serialized response all at once. ``cgi.FieldStorage`` actually protects against this.

Or set it to a very very large value, and allow reading past the end (returning ``""``). This is likely to work with most consumers.

I'm not sure what effect -1 will have on different code.

Other Possibilities
-------------------

* You could simply parse everything every time.
* You could pass data through callbacks in the environment (but this can break non-aware middleware).
* You can make custom methods and keys for each case.
* You can use something other than WSGI.

I think this specification offers advantages over all these options.

Open Issues
-----------

Should "type" be the class object? A string describing the type? Things like ``lxml.etree._Element`` are a little unclean, since the *actual* class isn't a public object (only the factory function ``lxml.etree.Element``). Also, there are occasionally times when multiple classes implement the same interface.

The boolean ``environ['x-wsgiorg.want_parsed_response']`` doesn't really give any idea of what *kind* of object you want. This is actually something of a problem, because sometimes it's impossible to give that kind of object. For instance, if you want to transform images you might want the PIL object for the image. But if the response is HTML there's no way to give this type. Similarly if you are transforming HTML then images don't mean anything to you, and you probably *do* want them to come out as normal.
And potentially *both* an image transformer and an HTML transformer are in the stack. Should that key actually hold a list of types that are of interest?

``x_wsgiorg_parsed_response`` isn't a very good name for the method on ``wsgi.input``, as it's not a response.

From pje at telecommunity.com Wed Mar 7 03:52:20 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 06 Mar 2007 21:52:20 -0500
Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking Middleware
In-Reply-To: <45EE1EAE.50705@colorstudy.com>
Message-ID: <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>

At 08:08 PM 3/6/2007 -0600, Ian Bicking wrote:
>Posted here: http://wsgi.org/wsgi/Specifications/avoiding_serialization
>
>Text copied below for discussion:
>
>
>:Title: Avoiding Serialization When Stacking Middleware
>:Author: Ian Bicking
>:Discussions-To: Python Web-SIG
>:Status: Proposed
>:Created: 06-03-2007
>
>.. contents::
>
>Abstract
>--------
>
>This proposal gives a strategy for avoiding unnecessary serialization
>and deserialization of request and response bodies. It does so by
>attaching attributes to ``wsgi.input`` and the ``app_iter``, as well as
>a new environment key ``x-wsgiorg.want_parsed_response``.
>
>Rationale
>---------
>
>Output-transforming middleware often has to parse the upstream content,
>transform it, then serialize it back to a string for output. The
>original output may have already been in the parsed form that the
>middleware wanted. Or there may be more middleware that does similar
>transformations on the same kind of objects.

HTTP already includes a mechanism for specifying what types are accepted by a content consumer: the "Accept" header. You can always add other values to it to indicate the parsed values you can accept.

Of course, this doesn't really work well with WSGI - you want the result to actually *be* WSGI... so you can use the WSGI way of doing this, which is to have a standard wrapper for the specific content type you want to use.
The wrapper (as with the wsgi "file wrapper") simply puts a WSGI face on a non-WSGI result body, converting it to an iterator of strings, and holding other attributes known to the middleware or other application object. This could be implemented as an environ key containing a mapping from types to wrapper functions. Middleware that wants a type just copies the mapping and overwrites any entries it cares about. Applications that want to return a non-serialized result just look up the type (using __mro__ order) to find an applicable wrapper. Notice that this approach doesn't require any special protocol for these wrappers -- just WSGI. It's simpler to specify, and simpler to implement than what you propose, while addressing some of the open issues. Yes, it does have some problems with interface vs. implementation. ISTM that trying to solve that problem is effectively asking to revive or reinvent PEP 246, however. But we could explicitly allow the use of type names instead of the actual types. >The same things apply to the parsing of ``wsgi.input``, specifically >parsing form data. A similar strategy is presented to avoid >unnecessarily reparsing that data. I would rather offer an optional 'get_file_storage()' method or some such as a blessed WSGI extension, than have such an open-ended "get whatever you want from the input object" concept floating around. A strategy which reinvents half of PEP 246 (the *old* PEP 246, before it became almost as complicated as WSGI) seems like overkill to me. >Obviously the code is not simple, but this is the nature of WSGI >output-transforming middleware. Something I'd like to fix in WSGI 2.0, by getting rid of both "start_response" and "write", but that's a discussion for another time. >Other Possibilities >------------------- > >* You could simply parse everything ever time. >* You could pass data through callbacks in the environment (but this can >break non-aware middleware). >* You can make custom methods and keys for each case. 
>* You can use something other than WSGI. And you can use the established WSGI method for adding semantics to a response, using a middleware-supplied wrapper. I think this is actually the best alternative. In truth, it could be as simple as using the class's fully-qualified name as an environ key (perhaps with a prefix or suffix), with the value being a wrapper for objects implementing that protocol. No x-foobar-wsgiorg-whatchamacallit cruft needed. And, it's lightweight enough of a concept to be expressed as a simple "best practice" design pattern. From fumanchu at amor.org Wed Mar 7 04:23:17 2007 From: fumanchu at amor.org (Robert Brewer) Date: Tue, 6 Mar 2007 19:23:17 -0800 Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking Middleware References: <45EE1EAE.50705@colorstudy.com> Message-ID: <435DF58A933BA74397B42CDEB8145A86224D55@ex9.hostedexchange.local> Ian Bicking wrote: > This proposal gives a strategy for avoiding unnecessary > serialization and deserialization of request and response > bodies. It does so by attaching attributes to ``wsgi.input`` > and the ``app_iter``, as well as a new environment key > ``x-wsgiorg.want_parsed_response``. > > [snip] > > for item in app_iter: > written_output.append(item) This bit of the example, at least, is not compliant with PEP 333: http://www.python.org/dev/peps/pep-0333/#middleware-handling-of-block-boundaries "To put this requirement another way, a middleware component must yield at least one value each time its underlying application yields a value. If the middleware cannot yield any other value, it must yield an empty string." I suspect rewriting the example to conform to PEP 333 will make this proposal much more complex? Robert Brewer System Architect Amor Ministries fumanchu at amor.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/web-sig/attachments/20070306/28095a2d/attachment-0001.htm From ianb at colorstudy.com Wed Mar 7 04:43:43 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 06 Mar 2007 21:43:43 -0600 Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking Middleware In-Reply-To: <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com> References: <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com> Message-ID: <45EE34EF.9030602@colorstudy.com> Phillip J. Eby wrote: > At 08:08 PM 3/6/2007 -0600, Ian Bicking wrote: >> Posted here: http://wsgi.org/wsgi/Specifications/avoiding_serialization >> >> Text copied below for discussion: >> >> >> :Title: Avoiding Serialization When Stacking Middleware >> :Author: Ian Bicking >> :Discussions-To: Python Web-SIG >> :Status: Proposed >> :Created: 06-03-2007 >> >> .. contents:: >> >> Abstract >> -------- >> >> This proposal gives a strategy for avoiding unnecessary serialization >> and deserialization of request and response bodies. It does so by >> attaching attributes to ``wsgi.input`` and the ``app_iter``, as well as >> a new environment key ``x-wsgiorg.want_parsed_response``. >> >> Rationale >> --------- >> >> Output-transforming middleware often has to parse the upstream content, >> transform it, then serialize it back to a string for output. The >> original output may have already been in the parsed form that the >> middleware wanted. Or there may be more middleware that does similar >> transformations on the same kind of objects. > > HTTP already includes a mechanism for specifying what types are accepted > by a content consumer: the "Accept" header. You can always add other > values to it to indicate the parsed values you can accept. > > Of course, this doesn't really work well with WSGI - you want the result > to actually *be* WSGI... so you can use the WSGI way of doing this, > which is to have a standard wrapper for the specific content type you > want to use. 
Yeah, using Accept is clever, but not really accurate, since if you serialize the WSGI request to HTTP the addition no longer makes sense.

> The wrapper (as with the wsgi "file wrapper") simply puts a WSGI face on
> a non-WSGI result body, converting it to an iterator of strings, and
> holding other attributes known to the middleware or other application
> object.

That just calls for a series of ad hoc techniques, basically, where each object type results in a new key in the environment and a new ad hoc specification to be made (e.g., wsgi.file_wrapper takes a block size, which is specific only to that case).

> This could be implemented as an environ key containing a mapping from
> types to wrapper functions. Middleware that wants a type just copies
> the mapping and overwrites any entries it cares about. Applications
> that want to return a non-serialized result just look up the type (using
> __mro__ order) to find an applicable wrapper.

OK, the dict would avoid multiple different kinds of keys, and presumably they'd all have the same signature. Block size doesn't really make any sense to me as a common parameter. Content type should be a common parameter, as something like an lxml object can be serialized as either XML or HTML. I don't think any response headers are likely to affect the serialization... though with my specification that remains an application concern, so it doesn't have to be resolved in the specification.

I hadn't really thought about MRO, though generally I don't trust inheritance to be meaningful anyway -- I feel like I'd be more likely to switch on the type than test inheritance.

> Notice that this approach doesn't require any special protocol for these
> wrappers -- just WSGI. It's simpler to specify, and simpler to
> implement than what you propose, while addressing some of the open issues.

The specification isn't particularly long or complicated, IMHO.
The implementation is complicated mostly for reasons unrelated to the specification -- any output-transforming middleware will be similarly complicated. > Yes, it does have some problems with interface vs. implementation. ISTM > that trying to solve that problem is effectively asking to revive or > reinvent PEP 246, however. But we could explicitly allow the use of > type names instead of the actual types. When playing with implementation I used type names, and actually I rather prefer them, but it's not always clear what name to use. For instance, "lxml", "lxml.etree", "lxml.etree.Element", and "lxml.etree._Element" all are reasonable names. Or "ElementTree", "ElementTree.Element", "ElementTree._Element", "xml.etree", "xml.etree.Element", and "xml.etree._Element". Or even something like "IElement" could make sense in some context (e.g., what if you can accept the overlapping interfaces of both lxml and ElementTree?) At least the actual type object seems easy enough. OTOH, there are actually cases when I'd like to say that I could accept a certain type without having to import the type. E.g., if I wanted to do an XSLT transformation, I *could* support several kinds of objects without requiring any of them (e.g., lxml, 4DOM, and Genshi Markup). >> The same things apply to the parsing of ``wsgi.input``, specifically >> parsing form data. A similar strategy is presented to avoid >> unnecessarily reparsing that data. > > I would rather offer an optional 'get_file_storage()' method or some > such as a blessed WSGI extension, than have such an open-ended "get > whatever you want from the input object" concept floating around. A > strategy which reinvents half of PEP 246 (the *old* PEP 246, before it > became almost as complicated as WSGI) seems like overkill to me. I don't really understand what you are proposing. 
This part addresses the same issues as presented in http://wsgi.org/wsgi/Specifications/handling_post_forms I really don't *want* to write every wsgi.input to a temporary file just because someone else *might* want to reparse the input. I'd much rather do it lazily, as 99% of the time reparsing won't happen. >> Obviously the code is not simple, but this is the nature of WSGI >> output-transforming middleware. > > Something I'd like to fix in WSGI 2.0, by getting rid of both > "start_response" and "write", but that's a discussion for another time. Yeah, that'd be nice, but another discussion for another time. >> Other Possibilities >> ------------------- >> >> * You could simply parse everything ever time. >> * You could pass data through callbacks in the environment (but this can >> break non-aware middleware). >> * You can make custom methods and keys for each case. >> * You can use something other than WSGI. > > And you can use the established WSGI method for adding semantics to a > response, using a middleware-supplied wrapper. I think this is actually > the best alternative. I really don't understand the advantage. > In truth, it could be as simple as using the class's fully-qualified > name as an environ key (perhaps with a prefix or suffix), with the value > being a wrapper for objects implementing that protocol. No > x-foobar-wsgiorg-whatchamacallit cruft needed. > > And, it's lightweight enough of a concept to be expressed as a simple > "best practice" design pattern. Best practice is fine, though of course still needs to be documented, as this is hardly a practice that people would naturally think about or implement. But I don't really think that practice would be any simpler or easier to describe if done completely. In fact, I think it would take exactly the same amount of space to describe. -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From pje at telecommunity.com Wed Mar 7 05:51:39 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue, 06 Mar 2007 23:51:39 -0500 Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking Middleware In-Reply-To: <45EE34EF.9030602@colorstudy.com> References: <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com> <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com> Message-ID: <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com> At 09:43 PM 3/6/2007 -0600, Ian Bicking wrote: >Phillip J. Eby wrote: >>The wrapper (as with the wsgi "file wrapper") simply puts a WSGI face on >>a non-WSGI result body, converting it to an iterator of strings, and >>holding other attributes known to the middleware or other application object. > >That just calls for a series of ad hoc techniques, As is appropriate for a "series of tubes". :) > basically, where each object type results in a new key in the > environment and a new ad hoc specification to be made (e.g., > wsgi.file_wrapper takes a block size, which is specific only to that case). Right. I'm specifically saying that a collection of individual specifications is much *better* than a single overarching specification generalized from a single example. Single use cases make bad general specs. >OK, the dict would avoid multiple different kinds of keys, and presumably >they'd all have the same signature. Block size doesn't really make any >sense to me as a common parameter. Content type should be a common >parameter, as something like an lxml object can be serialized as either >XML or HTML. I don't think any response headers are likely to effect the >serialization... though with my specification that remains an application >concern, so it doesn't have to be resolved in the specification. Please don't keep trying to generalize this. They're called "specific-ations", not "general-izations". :) >>Notice that this approach doesn't require any special protocol for these >>wrappers -- just WSGI. 
It's simpler to specify, and simpler to implement >>than what you propose, while addressing some of the open issues. > >The specification isn't particularly long or complicated, IMHO. That's because it doesn't address any of the real issues -- they're all deferred to your "open issues" section. That's why I don't think having the specification adds any value over highlighting the existing WSGI pattern for extending the response (i.e. server-supplied iterator-wrappers). >When playing with implementation I used type names, and actually I rather >prefer them, but it's not always clear what name to use. For instance, >"lxml", "lxml.etree", "lxml.etree.Element", and "lxml.etree._Element" all >are reasonable names. Or "ElementTree", "ElementTree.Element", >"ElementTree._Element", "xml.etree", "xml.etree.Element", and >"xml.etree._Element". Or even something like "IElement" could make sense >in some context (e.g., what if you can accept the overlapping interfaces >of both lxml and ElementTree?) > >At least the actual type object seems easy enough. OTOH, there are >actually cases when I'd like to say that I could accept a certain type >without having to import the type. E.g., if I wanted to do an XSLT >transformation, I *could* support several kinds of objects without >requiring any of them (e.g., lxml, 4DOM, and Genshi Markup). These problems all stem from premature generalization. It's a trivial problem to fix, however, if you are trying to share one particular content type: just pick a key and use it! Libraries such as wsgiref can support this pattern by providing a utility like "wrap_content(environ, content, default_wrapper, *keys)" function that looks up "keys" to find a wrapper to use in place of the default_wrapper. >>>The same things apply to the parsing of ``wsgi.input``, specifically >>>parsing form data. A similar strategy is presented to avoid >>>unnecessarily reparsing that data. 
>>I would rather offer an optional 'get_file_storage()' method or some such >>as a blessed WSGI extension, than have such an open-ended "get whatever >>you want from the input object" concept floating around. A strategy >>which reinvents half of PEP 246 (the *old* PEP 246, before it became >>almost as complicated as WSGI) seems like overkill to me. > >I don't really understand what you are proposing. That wsgi.input be allowed to have a 'get_file_storage()' method that can be called by applications, and that calling it means the input stream must not have been read and will no longer be readable. >This part addresses the same issues as presented in >http://wsgi.org/wsgi/Specifications/handling_post_forms > >I really don't *want* to write every wsgi.input to a temporary file just >because someone else *might* want to reparse the input. I'd much rather >do it lazily, as 99% of the time reparsing won't happen. I don't understand your complaint, as it seems unrelated to what I propose. >>>Other Possibilities >>>------------------- >>> >>>* You could simply parse everything ever time. >>>* You could pass data through callbacks in the environment (but this can >>>break non-aware middleware). >>>* You can make custom methods and keys for each case. >>>* You can use something other than WSGI. >>And you can use the established WSGI method for adding semantics to a >>response, using a middleware-supplied wrapper. I think this is actually >>the best alternative. > >I really don't understand the advantage. It's simple: *specifications are a liability in the general case*. They are supposed to be the record of negotiations between people who need to co-operate, not an attempt to solve all possible problems. So, if your spec is only about how relatively tight-coupled WFC's (WSGI framework components) talk to each other, it seems more properly the business of a web framework, not WSGI. However, it *is* WSGI (wsgi-onic?) 
for the authors of certain components to get together and say, "hey let's agree on this wrapper protocol"... or better yet, a wrapper *implementation*. This is way way better than having another spec. Every godforsaken new spec attached to WSGI just makes the whole thing seem way too complicated. In retrospect, I wish I hadn't supported some of the options and doodads and whatnots that are in WSGI today. If I had it to do over, WSGI would be a lot simpler. However, it's not too late to stop adding new cruft -- and I consider the idea of reinventing PEP 246 inside of WSGI to be cruft of a most horrible kind. >Best practice is fine, though of course still needs to be documented, as >this is hardly a practice that people would naturally think about or implement. Well, it's in PEP 333. > But I don't really think that practice would be any simpler or easier > to describe if done completely. In fact, I think it would take exactly > the same amount of space to describe. Even if it *did*, it'd still be better. However, since it's not a spec, it can be presented informally. Here's an example: "If you want to give applications underneath your middleware a chance to return rich responses (i.e., objects instead of strings), follow the pattern used for the WSGI 'file wrapper' object. That is, have your server or middleware add an environ key with a wrapper API that can convert the richer objects you're expecting into a standard WSGI iterator. Then, your server can simply inspect the iterators it receives to see if they are instances of your wrapper type, and pull out the objects you want. In this way, if there is middleware between you and the application returning the rich response that modifies the response body, you will receive an iterator of a different type, which you can process in the usual way. However, if you receive an instance of your wrapper type, you will know that you can access the rich data directly." 
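A minimal sketch of the pattern described in the quoted paragraph above, written for illustration only. It assumes a made-up environ key ("x-example.rich_wrapper") and a hypothetical RichResponse wrapper class; neither is part of PEP 333 or wsgiref. The wrap_content helper follows the signature suggested earlier in this message, but its body is a guess at the intended behaviour, not an existing wsgiref function.

```python
RICH_KEY = "x-example.rich_wrapper"  # hypothetical environ key, not a standard one


class RichResponse:
    """Hypothetical wrapper: an ordinary WSGI iterable that also carries the rich object."""

    def __init__(self, obj):
        self.obj = obj

    def __iter__(self):
        # Fallback serialization for consumers that only understand bytes.
        yield repr(self.obj).encode("utf-8")


def wrap_content(environ, content, default_wrapper, *keys):
    """Use the first wrapper published in environ under one of keys, else the default."""
    for key in keys:
        wrapper = environ.get(key)
        if wrapper is not None:
            return wrapper(content)
    return default_wrapper(content)


def app(environ, start_response):
    """Application that returns a rich object when a wrapper is available."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    data = {"greeting": "hello"}
    return wrap_content(environ, data, lambda d: [repr(d).encode("utf-8")], RICH_KEY)


def rich_middleware(application):
    """Middleware that publishes the wrapper and recognizes its own wrapper type."""

    def middleware(environ, start_response):
        environ[RICH_KEY] = RichResponse
        result = application(environ, start_response)
        if isinstance(result, RichResponse):
            # Nothing in between rewrote the body, so the rich object is reachable.
            return ["rich: {}".format(sorted(result.obj)).encode("utf-8")]
        # Some other iterator came back; treat it as a plain WSGI response.
        return result

    return middleware
```

Without the middleware the application falls back to plain bytes; with it, the middleware sees its own wrapper type and reads the dictionary directly. Any intermediate component that rewrites the body would hand back a different iterator, and the isinstance check would simply fail.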
Now, can you expand this into more of a tutorial, give more hints and so on? Absolutely. It'd be a great idea to. But the basic idea is simple and doesn't require rigorous definitions -- it just needs people to publish what keys they're using and the *specifications thereof*. What you're trying to specify is effectively a *meta*-specification: much more difficult to do well, and not nearly as useful to have in this case. From jim at zope.com Wed Mar 7 10:53:26 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 7 Mar 2007 04:53:26 -0500 Subject: [Web-SIG] daemon tools In-Reply-To: <200703051225.10896.jtate@rpath.com> References: <515038D2-29A5-498A-848E-8802C1963C91@zope.com> <200703051225.10896.jtate@rpath.com> Message-ID: <001FA4CA-1923-481C-8363-8381B7B7D6CD@zope.com> On Mar 5, 2007, at 12:25 PM, Joseph Tate wrote: > On Saturday 03 March 2007 11:08:24 Jim Fulton wrote: >> >> Anyway, I share this for your consideration. There are probably >> better tools out there than zdaemon and supervisor2, but I'm not >> aware of them. :) I'm curious what other people have found or use. > > ll.daemon (http://www.livinglogic.de/Python/daemon/index.html) > seems to be a > straightforward and very simple library for core daemon functionality. Ah, this was the one mentioned in the open-space talk. This looks very similar to a much earlier version of zdaemon. A disadvantage I see with it is that it requires modifying a Python application to use it. We moved away from that model with zdaemon, which can wrap any application. We use it to make the spread daemon sane for example. Does ll.daemon provide a monitoring process that restarts an application process if it exits abnormally? > > Daemontools isn't very well respected by the SysV style initscript > crowd, and > vice versa. That's an external non python dependency, and not > commonly > available. Certainly not available on Windows. Yes, I've heard similar things. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Wed Mar 7 11:04:47 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 7 Mar 2007 05:04:47 -0500 Subject: [Web-SIG] daemon tools In-Reply-To: <200703051225.10896.jtate@rpath.com> References: <515038D2-29A5-498A-848E-8802C1963C91@zope.com> <200703051225.10896.jtate@rpath.com> Message-ID: <36991237-260A-40C2-BFB4-23B201417E61@zope.com> On Mar 5, 2007, at 12:25 PM, Joseph Tate wrote: ... > ll.daemon (http://www.livinglogic.de/Python/daemon/index.html) > seems to be a > straightforward and very simple library for core daemon functionality. ... > I have written my own daemon base class (Pretty restrictive license > [reciprocal], but I'm sure I could get it loosened). > http://hg.rpath.com/raa-1.1?f=9ac380d082f4;file=raa/service/ > daemon.py I'm not > married to it though, so would be happy to spin it out and remove > the conary > requirements, or just junk it. Are either of these useful on Windows? IOW, do they map to services on windows? Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Wed Mar 7 11:16:36 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 7 Mar 2007 05:16:36 -0500 Subject: [Web-SIG] daemon tools In-Reply-To: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local> References: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local> Message-ID: <475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com> On Mar 5, 2007, at 1:38 PM, Robert Brewer wrote: ... > What several people have asked for is the ability to combine > applications (and WSGI components) from a variety of frameworks into a > single "website". What I'm proposing is that we standardize on a > set of > topics/channels/events/signals that are "site-wide" events, like > start, > stop, restart and graceful. 
If we collaborated on a tool to manage > those, we could potentially make the codebases of each project > smaller, > not just by removing the event manager, but by collaborating on a > set of > standard event handlers, one of which could be a "daemonize me" > handler. Agreed. > > What we have now: > > CherryPy Zope Django > -------- ------ ------- > ??? events signals > | | | > autoreload ??? autoreload > | | | > engine zdrun ??? > | | | > ??? zdctl ??? > > What we could have instead: > > webctl modpython_gateway > | / > ------------ pywebd ------------ > / | \ > -------- ------ ------ > CherryPy Zope Django > > > ...where the "pywebd" module: > > 1. Composes the WSGI stack (provides a library to do so at least), > 2. Notifies frameworks of site-wide events (like start, stop, restart > and graceful), > 3. Provides plugins that frameworks can "notify"; for example, adding > files to an autoreload plugin. This sounds great to me. >> I think your "sitewide container" is the main program that loads >> the WSGI components. This might be Apache, if mod_python is >> used, or some Python script/program. > > Apache itself is not going to be the chunk of code that loads the WSGI > components. In my head, a modpython_gateway module (or something > similar) would ask pywebd to do that. Right. >> I was discussing a tool that managed the main program in the >> latter case. Something that started and restarted it, provided >> status information, helped it to run as a proper daemon and so on. > > Sure, something like zdctl? But zdctl doesn't do the actual fork, > zdrun > does...so what does "help run as a proper daemon" mean? (zdrun is really an internal implementation detail of zdaemon. The latest version of zdaemon hides this much more than earlier versions. ) Logically, zdctl runs zdrun, which forks and execs the application process. (In the latest version, there is just one script, zdaemon, that loads either the zdctl or zdrun entry point when it is run.) 
zdrun does the daemonizing steps:

- disconnecting from the controlling terminal, and
- changing to a different user if requested

before forking and execing the application.

I see a division of responsibilities between:

- A facility for managing an application process
  - start/stop/status/etc
  - passing environment variables, providing some logging support if necessary (especially for applications that spew to standard err/out).
  - Optionally providing other daemon behaviors like disconnecting from the controlling terminal, changing user, etc. zdaemon provides this service on behalf of applications.
- A main program that provides common application-level services like the ones you describe above.
  - Optionally providing other daemon behaviors like disconnecting from the controlling terminal, changing user, etc. ll.daemon provides some of these services within an application.

A question is whether to provide the daemonizing support in the main program or in the controlling program. Note that in answering this question, we probably need to have an idea how this will work on Windows. If Unix-specific daemonizing code is in the main application, then the application won't be portable. Of course, if the main program is generic, it might not be a big deal to have separate versions for Windows and Unix.

Jim -- Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Wed Mar 7 11:34:15 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 7 Mar 2007 05:34:15 -0500 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> References: <45E99DC1.4010703@zetaweb.com> <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> Message-ID: <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> On Mar 5, 2007, at 4:38 PM, Phillip J. Eby wrote: ... > Personally, I don't care for the Paste Deploy syntax -- frankly > it's almost barbaric. :) I don't mean to pick on you, but I really *hate* comments like this. I don't like softer forms like "complicated" or even "makes me uneasy". It would be far more helpful if you provided specific criticism. I'd appreciate it if we would all just ignore statements like this and, preferably, stop making them. > But the concept of being able to specify stacks, routes, and > configuration in a plain text format that includes package > information for automated deployment is nonetheless an important one. Yes. > A couple years back, I started writing a library to parse a more > sophisticated, Python-like syntax to do the same sorts of things, > but only got as far as the parser. A few years back, we created a library to parse a more sophisticated Apache-like syntax and I wish we hadn't. The ini/config format is pretty standard and, IMO, really quite adequate. I'm convinced that we don't really need another configuration format, at least not at this level. ... > Anyway, all that aside, I think it would be fantastic if we could > come up with some "universal file format" for single-file > configuration and deployment of applications (including auto- > install of all needed eggs), Me too.
That's one of the reasons I created zc.buildout. But that's a big commitment. With buildout, I can use a single configuration file and have recipes that generate lots of little configuration files as necessary, for lots of applications like databases, LDAP servers, and web applications that will never use a single configuration file on their own. I'd be happy if we could tackle a simple configuration format that handled the kinds of things Paste Deploy handles now and maybe a little more. I'll get my cake and eat it too with buildout. :) > that could get stdlib support and ultimately hosting company > support. This would actually give us a leg up on even PHP for ease- > of-deployment. Aside from the universal configuration file issue, I think this would be a terrific thing for us to focus on. Something I hear a lot is how much easier PHP applications are to deploy to hosting providers. I would *love* it if Python had a similar story, even if only for smaller applications. I'd love to get some input from people who know a lot about what makes deploying PHP apps so easy. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Wed Mar 7 11:37:35 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 7 Mar 2007 05:37:35 -0500 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <200703051723.09795.jtate@rpath.com> References: <45E8EB97.6090805@zetaweb.com> <200703051254.57032.jtate@rpath.com> <45EC8952.1040703@colorstudy.com> <200703051723.09795.jtate@rpath.com> Message-ID: <153051AE-13FF-4D1F-860D-36F94A97A77D@zope.com> On Mar 5, 2007, at 5:23 PM, Joseph Tate wrote: > On Monday 05 March 2007 16:19:14 Ian Bicking wrote: >> Joseph Tate wrote: >>> I find that multiple files give you a nice way to override >>> defaults.
As >>> long as the files are read in a way that's predictable and >>> documentable, >>> and ultimately appear as if read from a single file (and possible >>> displayable via some diagnostics link in an application). >> >> Allowing this sort of thing means that the application carries >> around a >> complete config object of some sort, which I rather dislike -- it >> allows >> for smart applications, but it makes it much harder to understand the >> configuration and any possible side effects. If we resolve the >> configuration down to something more limited (as the Paste Deploy >> entry >> points do) you can't really reconstruct the config from there. >> *Something* could still reconstruct the config (an alternate config >> loader, via logs, via debug settings, etc), just not the application >> itself. >> >> This is somewhat problematic for applications that have particularly >> complex config requirements, or want to support self- >> configuration. The >> best solution that I can think of with Paste Deploy in that case >> is to >> just use the Paste Deploy configuration to point to the "real" >> configuration. > > I agree. That's why my app has a /config link that spits out the > "effective" > configuration. The overridden config is a hard requirement, I'd > love to hear > alternative solutions. /etc/php.d, /etc/httpd/conf.d and that ilk > come to > mind as examples of this kind of thing. FWIW, zc.buildout has a configuration model designed to support overriding. Often there is a base configuration that is overridden by specific configurations for development and deployment. It leverages the beautifully simple model of a dictionary of dictionaries provided by ConfigParser. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Wed Mar 7 12:01:12 2007 From: jim at zope.com (Jim Fulton) Date: Wed, 7 Mar 2007 06:01:12 -0500 Subject: [Web-SIG] [Proposal] "website" and first-level conf In-Reply-To: <45EDC772.3090803@simplistix.co.uk> References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> <45EDC772.3090803@simplistix.co.uk> Message-ID: <321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com> On Mar 6, 2007, at 2:56 PM, Chris Withers wrote: > Jim Fulton wrote: >> On Mar 3, 2007, at 11:27 PM, Chad Whitacre wrote: >> ... >>> Now, Jim: it looks like Zope still uses a Unix-y userland for >>> INSTANCE_HOME, yes? >> Yes, but I hate it. At Zope Corporation, we're moving away from >> it for a number of reasons. > > I actually like it a lot, still, and I haven't heard compelling > arguments, for me, for other things... > > The big plus point for me is that everything needed for one > deployment is in one folder. Having everything in one folder is great for development. It isn't so good for deployment, at least not on Unix. (I can think of lots of reasons why it wouldn't be great on Windows either.) For example, site administrators like to keep log files together and separate from other files. Even if things are all together, there's really no point in having separate subdirectories, typically containing only one or two files, within the instance. In a development instance, I'd much rather have a single directory containing the few needed files directly. The only exception to this for me would be to have a subdirectory for Python modules, if you have instance-specific Python modules. Having to look in subdirectories for configuration and log files is just a pain. ... >> For development, it adds structure that isn't needed. A Zope >> instance really only needs a few files. Trying to mimic some >> notional Unix layout just adds pointless structure.
> > It's kind of self-documenting though: > > /etc -> config > /bin -> scripts > /var -> data > /log -> logs > > I like that consistency, regardless of its origins... But without these, you have something like: zope.conf zopectl runzope debugzope scriptzope Data.fs zope.log It is pretty clear that zope.conf is a configuration file, zope.log is a log file, and that Data.fs is a data file. On Unix, it's pretty clear that the others are scripts, because they're executable and, on Windows, they should have .bat or .exe suffixes. >> The traditional complex Zope instance file layout led to the use >> of an instance "skeleton" to deal with all of the files, which >> led, in turn, to a copy-and-hack style of configuration >> customization that is inflexible and encourages cruft. > > I think the Zope 3 skeletons went the wrong way. The skeletons > work, but where they only contain config that's specific to that > instance. Zope 3's notions of putting Python scripts (and non- > trivial ones at that!) and the like into the instance home made me > shudder... I'm not sure if you are referring to more than scripts. I agree that we shouldn't have put utility scripts in instances. I would argue that only the ctl script should go in instances. The runzope, scriptzope, and debugzope scripts could be completely generic and invoked by an instance-specific ctl script. This is what I do in my latest Zope 3 buildout recipes. Otherwise, Zope 2 and Zope 3 skeletons look pretty similar to me. >> For production deployments, we (Zope Corporation) install files >> into the *real* Unix tree where site administrators want them. > > Not everyone runs on Unix. Having a standard layout that fits into > a folder works cross-platform to a large extent. Only for a particular definition of "works". No experienced Unix administrator would say it works on Unix. I suspect that a professional Windows server administrator would have similar concerns. ...
My original point was not to advocate a particular layout but to point out that different layouts will be needed in different situations and that mandating a particular layout was likely to cause problems. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From ubernostrum at gmail.com Wed Mar 7 13:08:13 2007 From: ubernostrum at gmail.com (James Bennett) Date: Wed, 7 Mar 2007 06:08:13 -0600 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> Message-ID: <21787a9f0703070408l70baf159r72a4865e19d75cc7@mail.gmail.com> On 3/7/07, Jim Fulton wrote: > Aside from the universal configuration file issue, I think this would > be a terrific thing for us to focus on. Something I hear a lot is > how much easier PHP applications are to deploy to hosting providers. > I would *love* it is Python had a similar story, even if only for > smaller applications. > > I'd love to get some input who know a lot about what makes deploying > PHP apps so easy. I've mostly been lurking because everybody here's quite a bit smarter than I am on most of the issues discussed, but in a past life I had a fair amount of experience working with and deploying PHP, so I'll throw in my $0.02. PHP is (or was, when I was doing it) "easy to deploy" largely because of two things: 1. mod_php. 2. Baked-in database libraries. 
Everybody already knows that web-server setup is a wart for Python (and the discussion on that lately has been encouraging), so I won't dwell on it except to say that I live for the day I'll be able to drop my Apache -> mod_proxy -> lighttpd -> Unix socket -> FastCGI -> WSGI -> Django setup (this on a "Python-friendly" shared host, no less) and have a server configuration that's simpler than the blog app it runs. The database issue is one that seems to get overlooked a bit, but is also a killer. PHP gives you SQLite and MySQL support for free, and Postgres is trivially easy to add if a host is offering Postgres databases. Meanwhile, most hosts are still with Python 2.3 or 2.4, so you don't even get SQLite out-of-the-box. The better ones will have appropriate DB modules installed anyway, but that still seems to be something of a crap shoot, and somebody who has to build their own copy of mysqldb to use Python on their hosting account is somebody who's not going to use Python on their hosting account. I'm hoping that the ongoing framework hype will help a lot with the database issue, though; a number of hosting companies right now seem to be waking up and realizing that there's a lot of money to be made from framework converts who need solid support for languages that aren't PHP. I'd say that if/when these two issues are overcome, or even made slightly less nasty to deal with, there's not really anything else PHP can compete on; WSGI and the ever-expanding range of kick-ass web tools Python offers blow PHP out of the water. To take an easy example, cruft-free URLs are still anywhere from tedious to nasty under PHP; you have to fiddle with mod_rewrite, and every PHP project has its own monolithic URL dispatch system. On the Python side, WSGI and tools like Paste Deploy make it trivially easy to hang any app anywhere you want it in your URL scheme. 
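To illustrate the point about hanging an app anywhere in the URL scheme, here is a minimal sketch of prefix dispatch in plain WSGI. It is not Paste's urlmap implementation; make_dispatcher, blog_app and not_found are hypothetical names invented for this example.

```python
def make_dispatcher(routes, default):
    """Return a WSGI app that mounts each app in routes at its URL prefix."""
    # Try longer prefixes first so "/blog/admin" wins over "/blog".
    ordered = sorted(routes.items(), key=lambda item: len(item[0]), reverse=True)

    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in ordered:
            if path == prefix or path.startswith(prefix + "/"):
                # Shift the matched prefix from PATH_INFO onto SCRIPT_NAME,
                # so the mounted app sees itself as living at "/".
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                environ["PATH_INFO"] = path[len(prefix):]
                return app(environ, start_response)
        return default(environ, start_response)

    return dispatcher


def blog_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [("blog sees " + environ["PATH_INFO"]).encode("utf-8")]


def not_found(environ, start_response):
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


site = make_dispatcher({"/blog": blog_app}, not_found)
```

The mounted application never needs to know its own prefix; it reads PATH_INFO as usual, and SCRIPT_NAME records where it was hung.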
And setting aside actual technical issues, I also think there's room to work with documentation; going back to Jim's comment at the PyCon frameworks panel about documentation that tells stories, it's worth pointing out that a lot of the "PHP is easier" perception is largely just that -- a perception -- and that various languages and tools, PHP included, have compensated for some pretty nasty warts by telling compelling stories (Rails certainly wouldn't be where it is today if not for some great storytelling on the part of the people marketing it). I'm sure we have plenty of good stories we could tell, and I'm pretty sure we don't have as many warts :) -- "Bureaucrat Conrad, you are technically correct -- the best kind of correct." From zbynek.winkler at gmail.com Wed Mar 7 13:50:16 2007 From: zbynek.winkler at gmail.com (Zbynek Winkler) Date: Wed, 7 Mar 2007 13:50:16 +0100 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <21787a9f0703070408l70baf159r72a4865e19d75cc7@mail.gmail.com> References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> <21787a9f0703070408l70baf159r72a4865e19d75cc7@mail.gmail.com> Message-ID: On 3/7/07, James Bennett wrote: > On 3/7/07, Jim Fulton wrote: > > Aside from the universal configuration file issue, I think this would > > be a terrific thing for us to focus on. Something I hear a lot is > > how much easier PHP applications are to deploy to hosting providers. > > I would *love* it is Python had a similar story, even if only for > > smaller applications. > > > > I'd love to get some input who know a lot about what makes deploying > > PHP apps so easy. 
> > I've mostly been lurking because everybody here's quite a bit smarter > than I am on most of the issues discussed, but in a past life I had a > fair amount of experience working with and deploying PHP, so I'll > throw in my $0.02. > > PHP is (or was, when I was doing it) "easy to deploy" largely because > of two things: > > 1. mod_php. > 2. Baked-in database libraries. And the fact that there is a really simple 'hello world' that just works. Python is dead simple for cmdline apps (print "hello world") but not for webapps. And the fact that deploying a Python app often consists of configuring the whole "everything" (if not building from source, or even finding on the web what exactly one needs for the particular situation) does not really help either. > Everybody already knows that web-server setup is a wart for Python > (and the discussion on that lately has been encouraging), so I won't > dwell on it except to say that I live for the day I'll be able to drop > my Apache -> mod_proxy -> lighttpd -> Unix socket -> FastCGI -> WSGI > -> Django setup (this on a "Python-friendly" shared host, no less) and > have a server configuration that's simpler than the blog app it runs. That is exactly what I meant :( Zbynek Winkler -- http://robotika.cz/ From sidnei at enfoldsystems.com Wed Mar 7 14:42:00 2007 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Wed, 7 Mar 2007 10:42:00 -0300 Subject: [Web-SIG] daemon tools In-Reply-To: <475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com> References: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local> <475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com> Message-ID: On Windows, the NT Service Controller does all the dirty work. And it's pretty easy to write a service in Python that can run any application. The simplest Python service is shorter than 30 lines, I think.
Dealing with a service on Windows usually involves:

- Registering/Unregistering the service
- Setting service options
  - Startup type (automatic/manual/disabled)
  - Username (can be local machine or Active Directory, if the machine is on a domain)
  - Dependencies (a service can depend on other services)
  - Failure mode
    - There are 3 tries by default, you can customize what happens on each try
      - Ignore
      - Restart the Service
      - Run a program
      - Restart the computer

The service, after being registered, can be managed with standard tools present on the system:

C:\src>net stop bthserv
The Bluetooth Support Service service is stopping.
The Bluetooth Support Service service was stopped successfully.

C:\src>net start bthserv
The Bluetooth Support Service service is starting.
The Bluetooth Support Service service was started successfully.

You can also use command-line tools to query the service status:

C:\src>sc \\pena queryex bthserv

SERVICE_NAME: bthserv
        TYPE               : 20  WIN32_SHARE_PROCESS
        STATE              : 4  RUNNING (STOPPABLE,NOT_PAUSABLE,ACCEPTS_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0
        PID                : 1372
        FLAGS              :

C:\src>sc \\pena queryex xmlprov

SERVICE_NAME: xmlprov
        TYPE               : 20  WIN32_SHARE_PROCESS
        STATE              : 1  STOPPED (NOT_STOPPABLE,NOT_PAUSABLE,IGNORES_SHUTDOWN)
        WIN32_EXIT_CODE    : 1077  (0x435)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0
        PID                : 0
        FLAGS              :

And that's just the tip of the iceberg. You can manage services on other machines, for example, still from the command line. You can query service status with WMI, and you can interact with services from .NET.

I would say that, thus, a service manager like 'zdaemon' is not actually that useful on Windows *unless* it implements a Windows Service. In fact, I could see it being used as both a 'standalone service manager' and as a simple service with the NT Service Controller with little overlap, though I would highly discourage the former.
There's some stuff from zdaemon that would be useful, though, and that does not work on Windows today due to some over-unixism in zdaemon, like the interactive prompt and script runner ('zopectl debug' and 'zopectl run'); I'm sure those two don't need to know about 'fork' or signals. What I'm really interested in is how the service would communicate with the program being controlled. This is the painful part, and where I think we need to work together to make sure it works on Windows and on *nix platforms. You can surely count on me to discuss that part. As I mentioned on another thread, Zope uses 'signals' on *nix, and 'named events' on Windows, by means of the 'Signals' package in Zope. We could possibly re-use that. -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 From jtate at rpath.com Wed Mar 7 14:46:33 2007 From: jtate at rpath.com (Joseph Tate) Date: Wed, 7 Mar 2007 08:46:33 -0500 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> References: <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> Message-ID: <200703070846.33887.jtate@rpath.com> On Wednesday 07 March 2007 05:34:15 Jim Fulton wrote: > I'd love to get some input who know a lot about what makes deploying > PHP apps so easy. It's not the packaging format. Most PHP apps come down as a tarball. Extract it to your Apache root, and you can connect to the app and do configuration, without even restarting Apache (thanks to mod_php). I think the key thing is that configuring a "well written" PHP app is done through the web interface. No mucking with config files, no Apache configuration required, etc. You just have to create a database and a user with permissions to it. If you'd like a specific example, I suggest trying to install Gallery (http://gallery.menalto.com).
There are sacrifices to make for this approach though: the app has to be able to write at least to its own config file, and to .htaccess. This means that security has to be super tight. Frequently the instructions are to chmod 777 the app's top-level directory, configure, and then unchmod. Because so many things can be modified via .htaccess, including directory-specific PHP settings, you rarely need further configuration. -- Joseph Tate Software Engineer rPath Inc. http://www.rpath.com/rbuilder/ (919) 851-3984 x2106 From rodsenra at gpr.com.br Wed Mar 7 15:34:14 2007 From: rodsenra at gpr.com.br (Rodrigo Senra) Date: Wed, 7 Mar 2007 11:34:14 -0300 Subject: [Web-SIG] daemon tools In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local> <475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com> Message-ID: <20070307113414.5ee7384a@Fenix> [ Sidnei da Silva ]: |The service, after being registered can be managed with standard tools |present on the system: | |C:\src>net stop bthserv # cut |C:\src>net start bthserv # cut |C:\src>sc \\pena queryex bthserv # cut |C:\src>sc \\pena queryex xmlprov # cut And, as I am sure you are aware, the service can also be managed by Python through win32all:

    # random samples from a python service watchdog ;o)
    import win32service  # part of win32all (pywin32)

    # `service` is the name of the service to watch
    hscm = win32service.OpenSCManager(None, None, win32service.SC_MANAGER_ALL_ACCESS)
    hsvc = win32service.OpenService(hscm, service, win32service.SERVICE_ALL_ACCESS)
    status = win32service.QueryServiceStatus(hsvc)
    # code to test status and decide to restart it (or not) omitted
    win32service.StartService(hsvc, None)

|I would say that, thus, a service manager like 'zdaemon' it's not |actually that useful on Windows *unless* it implements a Windows |Service. For symmetry's sake, on Windows a Python service manager could simply use the SCManager API under the hood (through win32all) to get the job done, still keeping a consistent cross-platform modus operandi.
| In fact, I could see it being used as both a 'standalone |service manager' Do you mean a wrapper for native SCManager services ? |There's some stuff from zdaemon that would be useful though, and do |not work on Windows today due to some over-unixism in zdaemon, like an |interactive prompt and script runner as 'zopectl debug' and 'zopectl |run', I'm sure those two don't need to know about 'fork' or signals. | |What I'm really interested in is in how the service would communicate |with the program being controlled. This is the painful part, and where |I think we need to work together to make sure it works on Windows and |on *nix platforms. You can surely count on me to discuss that part. One naive suggestion would be to wrap Unix signals and Windows Event Objects under a single signaling abstraction. If what you meant by "communicate" can be restricted to flag-waving (and *not* some general data structure IPC), then these mechanisms should suffice. At least, I can say that Windows (manual reset) Event Objects are simple, robust (even in multi-threaded scenarios), and reasonably cross-platform from within the Windows family, IMHO. |As I mentioned on another thread, Zope uses 'signals' on *nix, and |'named events' on Windows, by means of the 'Signals' package in Zope. |We could possibly re-use that. Great, just checked that out. I think that is the way to go. 
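A rough sketch of the single signaling abstraction suggested above, limited to flag-waving. EventChannel and bind_signals are hypothetical names; this is not the Zope 'Signals' package. Only the Unix side is wired up here. On Windows the assumption is that a thread waiting on a named event object would call publish() in the same way, which is not shown.

```python
import signal


class EventChannel:
    """Hypothetical site-wide channel: named events fan out to subscribers."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, event, callback):
        self._subscribers.setdefault(event, []).append(callback)

    def publish(self, event):
        # Events with no subscribers are silently ignored.
        for callback in self._subscribers.get(event, []):
            callback()


def bind_signals(channel, mapping):
    """Route OS signals to named events, skipping signals this platform lacks.

    Must be called from the main thread, as signal.signal() requires.
    Returns the names of the signals that were actually bound.
    """
    bound = []
    for signal_name, event in mapping.items():
        signum = getattr(signal, signal_name, None)
        if signum is None:  # e.g. SIGHUP does not exist on Windows
            continue
        signal.signal(signum, lambda _num, _frame, event=event: channel.publish(event))
        bound.append(signal_name)
    return bound


channel = EventChannel()
log = []
channel.subscribe("restart", lambda: log.append("restarting"))
# This is the call a SIGHUP handler (or a Windows event waiter) would make.
channel.publish("restart")
```

A controlling process would call something like bind_signals(channel, {"SIGHUP": "restart", "SIGTERM": "stop"}) at startup; the frameworks only ever see the event names.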
Cheers,
Senra

-------------
Rodrigo Senra
GPr Sistemas
http://www.gpr.com.br

From sidnei at enfoldsystems.com Wed Mar 7 16:44:17 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Wed, 7 Mar 2007 12:44:17 -0300
Subject: [Web-SIG] daemon tools
In-Reply-To: <20070307113414.5ee7384a@Fenix>
References: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local> <475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com> <20070307113414.5ee7384a@Fenix>
Message-ID:

On 3/7/07, Rodrigo Senra wrote:
> And, I am sure you are aware of that, the service can also be managed
> by Python through win32all:
> # snip

Yeah, sorry. I thought that was pretty obvious, but I realize it wasn't *wink*.

> For symmetry's sake in Windows a Python service manager could simply
> use SCManager API under the hood (through win32all) to get the job done,
> still keeping a consistent cross-platform modus operandi.

Your suggestion is indeed quite appealing. I feel sad for not having thought of that before. zdaemon could be just a wrapper for SCManager and that is certainly the way to go.

> |What I'm really interested in is in how the service would communicate
> |with the program being controlled. This is the painful part, and where
> |I think we need to work together to make sure it works on Windows and
> |on *nix platforms. You can surely count on me to discuss that part.
>
> One naive suggestion would be to wrap Unix signals and Windows Event
> Objects under a single signaling abstraction. If what you meant by
> "communicate" can be restricted to flag-waving (and *not* some general
> data structure IPC), then these mechanisms should suffice.

Yes, in the case of Zope that's mainly abstracting SIGINT, SIGHUP, etc.

> |As I mentioned on another thread, Zope uses 'signals' on *nix, and
> |'named events' on Windows, by means of the 'Signals' package in Zope.
> |We could possibly re-use that.
>
> Great, just checked that out. I think that is the way to go.

I hope that others can agree too.
-- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 From fumanchu at amor.org Wed Mar 7 19:47:10 2007 From: fumanchu at amor.org (Robert Brewer) Date: Wed, 7 Mar 2007 10:47:10 -0800 Subject: [Web-SIG] daemon tools In-Reply-To: <475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com> Message-ID: <435DF58A933BA74397B42CDEB8145A8609E8D2C0@ex9.hostedexchange.local> Jim Fulton wrote: > On Mar 5, 2007, at 1:38 PM, Robert Brewer wrote: > > ...where the "pywebd" module: > > > > 1. Composes the WSGI stack (provides a library to do so at least), > > 2. Notifies frameworks of site-wide events (like start, > stop, restart > > and graceful), > > 3. Provides plugins that frameworks can "notify"; for > example, adding > > files to an autoreload plugin. > > This sounds great to me. I wasn't expecting such quick agreement. ;) For anyone's information, I've started developing just such a beast in the CherryPy trunk: http://www.cherrypy.org/browser/trunk/cherrypy/pywebd CherryPy will probably continue to distribute it as a subpackage just for ease of install, but it won't have any CP dependencies. If others are really interested in developing this collaboratively, I'd be happy to make it its own project and solicit committers. In particular, there's no "webctl" module yet (because we need more discussion on its role before I commit to a direction). > I see a division of responsibilities between: > > * A facility for managing an application process > > - start/stop/status/etc > > - passing environment variables, providing some logging > support if necessary (especially for applications that > spew to standard err/out). > > - Optionally providing other daemon behaviors like > disconnecting from the controlling terminal, changing > user, etc. zdaemon provides this service on behalf of > applications. > > * A main program that provides common application-level > services like the ones you describe above. 
> > - Optionally providing other daemon behaviors like > disconnecting from the controlling terminal, changing > user, etc. ll.daemon provides some of these services > within an application. > > A question is whether to provide the daemonizing support in the main > program or in the controlling program. The "main program" should have the daemonization support. This would allow framework authors to continue providing "quickstart" and stop calls to their users as a full-featured alternative to invoking the controlling program (where "full-featured" includes daemonization, etcetera). IMO the controlling program ("webctl") wouldn't do any of your "optional daemon behaviors"; instead, it would be a command-line way to specify/collect an environment (including config files), start the main program, and then asynchronously send messages to the main program like "stop" and "status". It would run, execute a command, and then exit (much like apachectl does). This is also pretty much how I see zdctl operating, with a few areas I'd like to investigate: 1. I would very much like webctl to be the component that understands a WSGI-composition config format or formats. Or rather, I don't want pywebd to fuss with that--pywebd should understand the entry points and use/expose an API for composing a WSGI stack, but that should be an imperative API, so that frameworks can do their own composition for the user. For example, TG silently adds URL handlers for Mochikit (that shouldn't have to be included in a config file by the user). 2. AF_UNIX isn't available on Windows. I'd like to find ways of passing status back from pywebd to webctl that don't involve a socket. 3. zdctl spawns zdrun (right?). I'd like webctl to spawn pywebd, but currently I'm calling the whole package "pywebd". 
I probably need to change: /pywebd __init__.py base.py plugins.py win32.py ...to a more separated arrangement: /pyweb (other name ideas most welcome) __init__.py base.py plugins.py pywebd(.exe) unix.py webctl(.exe) win32.py > Note that in answering this question, we probably need to have an > idea how this will work on windows. If Unix-specific daemonizing > code is in the main application, then the application won't be > portable. Of course, if the main program is generic, it might not > be a big deal to have separate versions for Windows and Unix. My hope is that pywebd will have a "win32" module (as my initial foray does). Perhaps I should move the daemonization plugin to a "unix" (posix?) module. Robert Brewer System Architect Amor Ministries fumanchu at amor.org From ianb at colorstudy.com Wed Mar 7 22:49:38 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Wed, 07 Mar 2007 15:49:38 -0600 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> References: <45E99DC1.4010703@zetaweb.com> <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> Message-ID: <45EF3372.4020007@colorstudy.com> Jim Fulton wrote: >> A couple years back, I started writing a library to parse a more >> sophisticated, Python-like syntax to do the same sorts of things, >> but only got as far as the parser. > > A few years back, we created a library to parse more sophisticated > apache-like syntax and I wish we hadn't. The ini/config format is > pretty standard and, IMO, really quite adequate. I'm convinced that > we don't really need another configuration format, at least not at > this level. Details of the structure aside, I've found string:string dictionaries entirely sufficient for expressing every configuration I've wanted to do. 
I'm very happy that Paste Deploy doesn't support Python syntax for anything. >> that could get stdlib support and ultimately hosting company >> support. This would actually give us a leg up on even PHP for ease- >> of-deployment. > > Aside from the universal configuration file issue, I think this would > be a terrific thing for us to focus on. Something I hear a lot is > how much easier PHP applications are to deploy to hosting providers. > I would *love* it is Python had a similar story, even if only for > smaller applications. > > I'd love to get some input who know a lot about what makes deploying > PHP apps so easy. Well, it's a big help that PHP doesn't have Python's import system. Oh how I hate Python imports... anyway, since it just uses the filesystem everything is kind of naturally hierarchical and isolated. There are some system-wide configurations (in php.ini) -- these cause deployers a lot of pain. But they are mostly overridable with .htaccess, I think. Also there's not many libraries, and what libraries there are are typically shipped with the applications. PEAR (the PHP library system) started after I stopped doing much of any PHP, so I don't know how it effects things. PHP also gets a lot of benefit from a CGI-like execution model. There's a ton of crap that gets swept under the rug by this -- lots of memory leaks, for instance. As they've been building up larger frameworks built from PHP code, the CGI-like execution speed has also been hitting them. But since they have a fairly large library written in C (that is persistent and shared) it's usually pretty reasonable; it's just when they tried to copy Rails that it started really biting them. I think the database drivers are a bit of a red herring. What extensions PHP has been compiled with is pretty fixed by the hosting provider -- they just happen to all provide database drivers for the databases they support. 
Which is kind of a no-brainer; if they *cared* about Python they'd easily be able to do the same for Python. It would help if Python shipped with one or two, but eh. Anyway, my feelings are that it's: (a) simple hierarchy through the filesystem (which will make Chad all excited ;), (b) reliability of the CGI model, and (c) hosting providers give a damn. We can't do much about (c). (a) requires an isolation tool, but we have a few now. (b) still needs doing. That Python is theoretically faster than PHP due to its typical execution model doesn't mean much to hosting providers. They tend to be memory-constrained more than CPU constrained anyway. And if you have slow code, you personally suffer -- but if you use lots of memory, you make everyone suffer. One thing many hosts do is just periodically kill user's processes if they hang around too long. Most don't seem to care if you have long-running processes, though I've heard a few might disable your account. Someone (but I've forgotten who) suggested a technique to assist with this. The SCGI package has a script cgi2scgi, just a simple CGI script written in C that sends the request to another server; I think just a port, but I'm sure it could be extended easily enough to send it to a named socket. Anyway, if there was just a bit of process management code in that script it could also serve as a launcher, doing on-demand launching of a server (Flup I suppose) and then passing it on to that script. FastCGI does all these things, but setup can be fairly complicated and many implementations are buggy. Anyway, extending cgi2scgi to do this, along with some isolated environment, should be a fairly simple way to make Python hosting on commodity hosts a lot easier. Some of the hosts only give FTP access, and may not have a compiler. So ideally you could assemble everything on your workstation and upload it in batch. 
Probably a single Linux executable would be fine -- FreeBSD should be able to run it fine, and everything that matters (for this use case) is Linux or FreeBSD. Hopefully Sidnei won't mind that we leave Windows out ;) -- commodity Windows hosting is another situation entirely (about which I know nothing). -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From eucci.group at gmail.com Wed Mar 7 23:55:10 2007 From: eucci.group at gmail.com (Jeff Shell) Date: Wed, 7 Mar 2007 15:55:10 -0700 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> Message-ID: <88d0d31b0703071455g746fc952x4217861043e76399@mail.gmail.com> On 3/7/07, Jim Fulton wrote: > > On Mar 5, 2007, at 4:38 PM, Phillip J. Eby wrote: > ... > > Personally, I don't care for the Paste Deploy syntax -- frankly > > it's almost barbaric. :) > > I don't mean to pick on you, but I really *hate* comments like this. > I don't like softer forms like "complicated" or even "makes me > uneasy". It would be far more helpful if you provides specific > criticism. I'd appreciate it if we would all just ignore statements > like this and, preferably, stop making them. I agree. A problem I have is that I see these files with their syntax and I balk. I don't think it's the syntax that's at issue as much as it is that there's now a new set of terms that I don't understand. 'Entry Point' is one that that shorted out my brain for a long time whenever I'd try to look at the Paste docs to figure out what Paste was. I think I hold Python to a different standard as I want to know what something is doing. I don't think about this when I configure Apache. 
I just know that very few of my Zope 3 terms map to Paste terms, and all of this talk of 'filters' and 'entry points' and the like... I look at it and go "huh, interesting." And then it's back to work on my own thing. ... > > A couple years back, I started writing a library to parse a more > > sophisticated, Python-like syntax to do the same sorts of things, > > but only got as far as the parser. > > A few years back, we created a library to parse more sophisticated > apache-like syntax and I wish we hadn't. The ini/config format is > pretty standard and, IMO, really quite adequate. I'm convinced that > we don't really need another configuration format, at least not at > this level. While we're all talking about what we did or did not make, I found that I wanted a lot more direct control than zc.buildout gave me. After growing frustrated with writing Recipes and having to mentally manage the glue between a config file that was like a make file (it makes a lot of things) but not like a Rake file (no ability to include my own programming logic within the buildout spec, only in recipes), I took inspiration from Rake (a Ruby tool) and wrote a tool that looks for `Rockfile`, which is basically a Python file (no .py extension so as to avoid accidental imports). I still don't *really* understand Eggs, nor how to get them to work easily within individual Zope 3 instances. None of our existing Zope 3 libraries / apps are written as eggs or even as distutils-installable packages. We just check our packages directly out of CVS, and typically just check out other libraries from their repositories as well. We dump them right in $INSTANCE_HOME/lib/python (a layout that I actually like) and can then rest assured that a newly deployed app's need/use of SQLAlchemy 0.3.4 doesn't interfere with an already running app's need/use of SQLAlchemy 0.2.8. 
A further benefit of having the Rockfile system is that they can be used for other tasks done during development, such as updating MochiKit, generating a special 'NoExport.js' file, and then packing a few different combinations of MochiKit together.

    from rocketbuild.api import *
    from string import Template

    ns = namespace('mochikit')

    ROCKFILEPATH = globals().get('ROCKFILEPATH', path('.'))
    MOCHIKIT_LIB = ROCKFILEPATH/'libs'/'mochikit'
    MOCHIKIT_DL = ROCKFILEPATH/'mochikit_dl'
    MOCHIKIT_SRC = MOCHIKIT_DL/'MochiKit'
    SCRATCH = MOCHIKIT_LIB/'_scratch.js'
    CLEANUP = [MOCHIKIT_DL]

    NOEXPORT = Template("""\
    /*
     * Built for MochiKit SVN Checkout ${revision}
     */
    var MochiKit = {
        __export__: false
    };
    """)

    @ns.task('get')
    def getmochikit():
        if MOCHIKIT_DL.exists() and bool(MOCHIKIT_DL.listdir()):
            return
        svn = Subversion('http://svn.mochikit.com/mochikit')
        svn.co('trunk', target=MOCHIKIT_DL)

    @ns.task('clearmochilib')
    def clearmochilib():
        for jscript in MOCHIKIT_LIB.files('*.js'):
            jscript.remove()

    @ns.task('make-noexport')
    def makenoexport():
        info = Subversion().info(MOCHIKIT_DL)
        src = NOEXPORT.safe_substitute(**info)
        file(MOCHIKIT_LIB/'NoExport.js','w').write(src)

    @ns.task('build', ['get', 'clearmochilib', 'make-noexport'])
    def mochi_install():
        for source in MOCHIKIT_SRC.files('*.js'):
            log.info('copy %s -> %s' % (source, MOCHIKIT_LIB))
            source.copy(MOCHIKIT_LIB)

    @task('clear')
    def clear():
        for p in filter(path.exists, paths(*CLEANUP)):
            log.info('rmtree: %s', p.name)
            p.rmtree()
        if SCRATCH.exists():
            SCRATCH.remove()

I guess I'm just a control freak. It was too hard to control Buildout to build out something that matches the way we've worked for years; it was easier to write a tool from scratch. Which I think is the Python way, for better or worse.

Anyways, this is the tool that we're starting to use at Bottlerocket to automate our deployments as they grow more complex.

> ...
> > > Anyway, all that aside, I think it would be fantastic if we could > > come up with some "universal file format" for single-file > > configuration and deployment of applications (including auto- > > install of all needed eggs), Configuration and deployment? I'm trying to understand the scope of these terms (or this combined term) better. I take it 'configuration' means just how an 'app' might publish itself to a WSGI server. Is that right? For us, deployment now is: 1. Make a Zope 3 instance home ('appserv1') 2. `cd appserv1/lib/python; cvs checkout customerapp` 3. `rockout -vv customerapp/Rockfile install` (installs dependencies, mostly by CVS / Subversion checkout, usually directly into `appserv1/lib/python`) 4. `cd ../../etc` (back to 'appserv1/etc') 5. choose a port number in zope.conf (the zope/twisted server config) 6. add two lines to Zope 3's `site.zcml` to set up our app: The first line is a single file that sets up all of the dependencies and includes them in the proper order (probably only of interest, maybe, to other Zope 3 people). Basically this is my startup for my application within the Zope 3 application. The second line refers to configuration settings for machine local resources (database connections, cached resource directories, and so on). This may be written at deployment time. We keep it within the app so that it stays under source control, and also lets us know the names of services on which we may depend. This is also Zope 3 specific. I don't know of any way in which a configuration tool could be generic enough to handle any of this - even something as generic as a dburi string - unless it was restricted to handling ONLY basic values. 7. add site info to apache (rewriterule(s) / proxy). Is this analogous to the deployment and configuration being discussed? 
Or is the desired outcome really one where I hand someone a tarball and/or config file/script which would bring in (or have) ALL of the Zope 3 framework along with my application and its dependents, ALL in a way that doesn't trample on anyone/anything else (completely self contained), and that someone can then add a line or two to the web server's config file (if even that) and it all just runs? I guess Jim may be the only one with the Zope 3 knowledge to answer this. ... > > that could get stdlib support and ultimately hosting company > > support. This would actually give us a leg up on even PHP for ease- > > of-deployment. > > Aside from the universal configuration file issue, I think this would > be a terrific thing for us to focus on. Something I hear a lot is > how much easier PHP applications are to deploy to hosting providers. > I would *love* it is Python had a similar story, even if only for > smaller applications. > > I'd love to get some input who know a lot about what makes deploying > PHP apps so easy. I believe it's been said already that many PHP apps can just be un-tarred/gzipped. Plus, PHP has the benefit of being basically built in to Apache. Most hosting providers can enable PHP for individual accounts in a snap. So in many cases, deploying a PHP app is seldom any harder than deploying a static web site. Granted, there are more advanced applications, and I don't know how they get packaged or installed. Perhaps PHP is an unfair case to look at: it's built in, and isn't terribly complex. It's an easy processor directive. A Pylons, Turbogears, or Zope 3 'app' isn't a bunch of .psp files that are executed automatically by Apache. A more fair case to look at is Java application deployment - maybe. I have no experience (yay!) with this. I'm still a bit confused by the "write with any framework, deploy on any server" line I've heard from the Servlet/J2EE world. 
I think I've always considered it all to be one and the same, coming from my long history with Zope, I've thought "if I program against Zope, I serve from Zope." But in theory, since Zope 3 has `zope.app.wsgi`, I could serve from... anything? I guess that since I don't think about serving via Twisted any more than I thought about serving via ZServer, I could put CherryPy, mod_wsgi, whatever else underneath, right? Sorry if that's a lot of questions. I'm still trying to grasp everything. -- Jeff Shell From graham.dumpleton at gmail.com Thu Mar 8 00:04:44 2007 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Thu, 8 Mar 2007 10:04:44 +1100 Subject: [Web-SIG] WSGI server/adapter and sys.exit()/SystemExit exception. Message-ID: <88e286470703071504u2459bcb6r3969cff7fe2a06d5@mail.gmail.com> Since discussion is moving towards look at defining responsibilities of the container or environment that a WSGI application runs in, thought it would be a good time to ask this question. The question is, if a WSGI application calls sys.exit() or raises a SystemExit exception explicitly, what action if any should a WSGI server/adapter take in response. Should it allow the process to be shutdown, or should it ignore it. If the WSGI server/adapter doesn't ignore it, then you run the risk of a WSGI application shutting down your whole web server if everything runs within the one process. In the case of a web server where applications run in multiple spawned child process, eg Apache, then you only affect the one process that the request was handled within. Even so, in the case of Apache, if the worker MPM was being used and thus there could be requests being handled in parallel in the same process, maybe not even as Python requests, but static file requests, PHP requests, CGI etc, then these other requests would still be affected by the process being killed. 
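
One way a server or adapter could choose to ignore the exit, rather than let the process die, can be sketched as WSGI middleware. This is only an illustration under stated assumptions: the name `ignore_system_exit` is hypothetical, the wrapper buffers the whole response body (so streaming is lost), and it covers only the request path, not exits raised elsewhere.

```python
import sys


def ignore_system_exit(app):
    """Wrap a WSGI app so that SystemExit becomes a 500 response (sketch)."""

    def wrapper(environ, start_response):
        try:
            result = app(environ, start_response)
            try:
                # Consume the body here so that an exit raised while the
                # application is still producing output is caught as well.
                body = list(result)
            finally:
                if hasattr(result, "close"):
                    result.close()
            return body
        except SystemExit:
            # Passing exc_info lets start_response replace any headers the
            # application had already set but that were not yet sent.
            start_response(
                "500 Internal Server Error",
                [("Content-Type", "text/plain")],
                sys.exc_info(),
            )
            return [b"The application tried to exit the process.\n"]

    return wrapper
```

Whether turning the exit into an error response is the right policy is the open question here; the sketch only shows that the choice can be made in one place, outside the application.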
Thus to my mind any WSGI server/adapter should possibly always ignore a SystemExit exception coming from within an executing WSGI application.

One also has to worry, though, about SystemExit exceptions raised as a side effect of a Python import performed to load a WSGI application. Then you potentially have the issue of a SystemExit exception raised from a thread spawned by a WSGI application.

What are other people's thoughts on this? Should one try to protect the container application from abuse of SystemExit by a WSGI application, or should one simply trust the application writer? In a shared web hosting environment, can someone ever trust an application writer in this way though?

Comments?

Graham

From fumanchu at amor.org Thu Mar 8 00:13:20 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Wed, 7 Mar 2007 15:13:20 -0800
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <88d0d31b0703071455g746fc952x4217861043e76399@mail.gmail.com>
Message-ID: <435DF58A933BA74397B42CDEB8145A8609E8D931@ex9.hostedexchange.local>

Jeff Shell wrote:
> Configuration and deployment?
>
> I'm trying to understand the scope of these terms (or this combined
> term) better. I take it 'configuration' means just how an 'app' might
> publish itself to a WSGI server. Is that right?
>
> For us, deployment now is:
>
> 1. Make a Zope 3 instance home ('appserv1')
> 2. `cd appserv1/lib/python; cvs checkout customerapp`
> 3. `rockout -vv customerapp/Rockfile install` (installs
> dependencies, mostly
> by CVS / Subversion checkout, usually directly into
> `appserv1/lib/python`)
> 4. `cd ../../etc` (back to 'appserv1/etc')
> 5. choose a port number in zope.conf (the zope/twisted server config)
> 6. add two lines to Zope 3's `site.zcml` to set up our app
> 7. add site info to apache (rewriterule(s) / proxy).
>
> Is this analogous to the deployment and configuration being discussed?
Yes, although I want to make sure we keep discussion of 'site installation' very separate from 'website composition' (where you already have all the pieces and just need to declare where they are and how they map to URL's). IMO site installation is a 3 to 5-year project; website composition is a one-year project that shouldn't get bogged down in the former. > But in theory, since Zope 3 has `zope.app.wsgi`, I could serve from... > anything? I guess that since I don't think about serving via Twisted > any more than I thought about serving via ZServer, I could put > CherryPy, mod_wsgi, whatever else underneath, right? In theory, yes. For example, you should be able to put CherryPy's WSGI server underneath. Most of the rest of CherryPy (the app framework bits) are not directly *connectable* to the rest of Zope, but one of the dreams of WSGI is that you could *compose* a site using apps from multiple frameworks. See the diagram at the bottom of http://www.cherrypy.org/wiki/WSGI for example, which shows all of the places you can connect foreign WSGI components with CherryPy WSGI components. Robert Brewer System Architect Amor Ministries fumanchu at amor.org From fumanchu at amor.org Thu Mar 8 00:25:07 2007 From: fumanchu at amor.org (Robert Brewer) Date: Wed, 7 Mar 2007 15:25:07 -0800 Subject: [Web-SIG] WSGI server/adapter and sys.exit()/SystemExit exception. In-Reply-To: <88e286470703071504u2459bcb6r3969cff7fe2a06d5@mail.gmail.com> Message-ID: <435DF58A933BA74397B42CDEB8145A8609E8D969@ex9.hostedexchange.local> Graham Dumpleton wrote: > The question is, if a WSGI application calls sys.exit() or raises a > SystemExit exception explicitly, what action if any should a WSGI > server/adapter take in response. Should it allow the process to be > shutdown, or should it ignore it. > > ...to my mind any WSGI server/adapter should possibly always ignore > a SystemExit exception coming from with an executing WSGI application. 
> One though also has to worry about SystemExit exceptions raised as a > side effect of a Python import performed to load a WSGI application. > Then you potentially have the issue of SystemExit exception raised > from thread spawned by WSGI application. For now, I'd try to give deployers using my tools direct control over whether an application is allowed to stop the process or not via SystemExit. In a future pywebd/webctl world, I'd like to see process shutdown/restart delegated to plugins only, which can then be attached/detached by deployers. For example, the current pywebd autoreload plugin can call os.execv; if you're deploying with mod_python that autoreload plugin is simply never attached and therefore cannot call execv. If deploying by calling a hypothetical 'webctl' script, I would expect a command-line arg or config entry which controlled whether or not to plug in the autoreloader. Finally, when deploying from CherryPy itself, it plugs in (and configures) the autoreloader based on the existing CherryPy config semantics. Robert Brewer System Architect Amor Ministries fumanchu at amor.org From ianb at colorstudy.com Thu Mar 8 00:35:35 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Wed, 07 Mar 2007 17:35:35 -0600 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <88d0d31b0703071455g746fc952x4217861043e76399@mail.gmail.com> References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> <88d0d31b0703071455g746fc952x4217861043e76399@mail.gmail.com> Message-ID: <45EF4C47.8060700@colorstudy.com> Jeff Shell wrote: > But in theory, since Zope 3 has `zope.app.wsgi`, I could serve from... > anything? I guess that since I don't think about serving via Twisted > any more than I thought about serving via ZServer, I could put > CherryPy, mod_wsgi, whatever else underneath, right? 
In theory you can set up Zope 3 using something like:

    [app:main]
    paste.app_factory = some_function_yet_to_be_written

I thought zope.paste did this, but it's a little wonky now that I look at it. It seems to basically read INSTANCE_HOME and create a single Zope WSGI app, and then kind of minimally plug into it. That function would more ideally look like:

    import os
    from zope.app.wsgi import getWSGIApplication

    def make_zope_app(global_conf, instance_home=None, configfile=None):
        if configfile is None:
            configfile = global_conf.get('configfile')
        if configfile is None:
            if instance_home is None:
                instance_home = global_conf.get('instance_home')
            if not instance_home:
                raise ValueError(
                    'You must give a configfile or instance_home value')
            configfile = os.path.join(instance_home, 'etc', 'zope.conf')
        app = getWSGIApplication(configfile)
        return app

Then in Zope's setup.py:

    setup(...
        entry_points="""
        [paste.app_factory]
        main = zope.some_module:make_zope_app
        """)

Then you'd configure it like:

    [app:main]
    use = egg:Zope
    # Same directory as the config file:
    instance_home = %(here)s
    # instead of "use", and if you didn't set up the entry point:
    paste.app_factory = zope.some_module:make_zope_app

And you'd set up a server like:

    [server:main]
    # CherryPy doesn't natively provide this entry point...
    use = egg:PasteScript#cherrypy
    # or...
    #use = egg:Paste#http, egg:Flup#scgi, etc
    host = 0.0.0.0
    port = 8080

Put both those sections in one file (say, deploy.ini) and then do:

    $ paster serve deploy.ini

And it'll start up.

Additionally, instead of plugging that app directly into a server, you could wrap it with different kinds of middleware, which is where it starts looking a bit more interesting. For instance, for Paste's interactive debugger:

    [app:main]
    use = egg:Zope
    ...
    filter-with = egg:Paste#evalerror

Though that probably won't quite work, because we don't all agree on a way to indicate to the app that it shouldn't catch unexpected errors (Zope uses environ['wsgi.handleErrors']); which is incidentally what this proposed spec would help us agree on: http://wsgi.org/wsgi/Specifications/throw_errors

--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From eucci.group at gmail.com Thu Mar 8 00:58:20 2007
From: eucci.group at gmail.com (Jeff Shell)
Date: Wed, 7 Mar 2007 16:58:20 -0700
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <435DF58A933BA74397B42CDEB8145A8609E8D931@ex9.hostedexchange.local>
References: <88d0d31b0703071455g746fc952x4217861043e76399@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A8609E8D931@ex9.hostedexchange.local>
Message-ID: <88d0d31b0703071558t179d13cdi3fc3b167f1a789be@mail.gmail.com>

On 3/7/07, Robert Brewer wrote:
> Jeff Shell wrote:
> > Configuration and deployment?
> >
> > I'm trying to understand the scope of these terms (or this combined
> > term) better. I take it 'configuration' means just how an 'app' might
> > publish itself to a WSGI server. Is that right?
> >
> > For us, deployment now is:
> >
> > 1. Make a Zope 3 instance home ('appserv1')
> > 2. `cd appserv1/lib/python; cvs checkout customerapp`
> > 3. `rockout -vv customerapp/Rockfile install` (installs
> > dependencies, mostly
> > by CVS / Subversion checkout, usually directly into
> > `appserv1/lib/python`)
> > 4. `cd ../../etc` (back to 'appserv1/etc')
> > 5. choose a port number in zope.conf (the zope/twisted server config)
> > 6. add two lines to Zope 3's `site.zcml` to set up our app
> > 7. add site info to apache (rewriterule(s) / proxy).
> >
> > Is this analogous to the deployment and configuration being discussed?
> > Yes, although I want to make sure we keep discussion of 'site > installation' very separate from 'website composition' (where you > already have all the pieces and just need to declare where they are and > how they map to URL's). IMO site installation is a 3 to 5-year project; > website composition is a one-year project that shouldn't get bogged down > in the former. Could you elaborate more on these terms? To whom do the spans 'one year project' and '3 to 5 year project' apply? Often we have web apps, written in Zope 3, that are really two or more web apps. Like an 'admin' side and 'public' side, typically handled via different skins/views. Apache rewrite rules basically handle that routing. So in my mind, if I deploy our CMS, I have the following URL maps: http://example.com/admin/(.*) => examplesite/++skin++CMSAdmin/$1 http://example.com/(.*) => examplesite/++skin++ExamplePublic/$1 Same Zope application, with just a couple of different settings based on the incoming URL, and then Zope and our app handles the rest of the URL. Is that a site installation? Two site installations? Or two examples of website composition? Again, I'm just trying to understand the terminology and map it to the way I'm used to working, and I think of the above as 'site installation'. The other tried and true example I can think of is when a customer asks "uhm, and can we have a forum with that?" I guess website composition might include the above two URL maps, plus one for: http://example.com/forum/(.*) => SuperTerrificPylonsWebForumWSGI But should this be the provence of WSGI? With Apache rewrite rules, if I was doing such a blunt grafting of 'forum' onto my customer's site, I could just as easily use phpBB. Then I'm not limiting myself to Python if I feel there's a better suited tool for a particular task. I brought up this forum example because it's something we've run into a couple of times and may be about to encounter again. 
Depending on customer needs and wants, one of our thoughts is to just drop in some PHP bulletin board or some other feature complete app. So if SuperTerrificPylonsWebForumWSGI is basically a black box - I configure its colors, templates, etc, but expect no other integration with the customer's main site / CMS - what benefits might I get from composing via WSGI? -- Jeff Shell From ianb at colorstudy.com Thu Mar 8 01:13:22 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Wed, 07 Mar 2007 18:13:22 -0600 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <88d0d31b0703071558t179d13cdi3fc3b167f1a789be@mail.gmail.com> References: <88d0d31b0703071455g746fc952x4217861043e76399@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A8609E8D931@ex9.hostedexchange.local> <88d0d31b0703071558t179d13cdi3fc3b167f1a789be@mail.gmail.com> Message-ID: <45EF5522.8000004@colorstudy.com> Jeff Shell wrote: > Often we have web apps, written in Zope 3, that are really two or more > web apps. Like an 'admin' side and 'public' side, typically handled > via different skins/views. Apache rewrite rules basically handle that > routing. So in my mind, if I deploy our CMS, I have the following URL > maps: > > http://example.com/admin/(.*) => examplesite/++skin++CMSAdmin/$1 > http://example.com/(.*) => examplesite/++skin++ExamplePublic/$1 > > Same Zope application, with just a couple of different settings based > on the incoming URL, and then Zope and our app handles the rest of the > URL. > > Is that a site installation? Two site installations? Or two examples > of website composition? Again, I'm just trying to understand the > terminology and map it to the way I'm used to working, and I think of > the above as 'site installation'. > > The other tried and true example I can think of is when a customer > asks "uhm, and can we have a forum with that?" 
I guess website
> composition might include the above two URL maps, plus one for:
>
> http://example.com/forum/(.*) => SuperTerrificPylonsWebForumWSGI
>
> But should this be the province of WSGI? With Apache rewrite rules, if
> I was doing such a blunt grafting of 'forum' onto my customer's site,
> I could just as easily use phpBB. Then I'm not limiting myself to
> Python if I feel there's a better suited tool for a particular task.
>
> I brought up this forum example because it's something we've run into
> a couple of times and may be about to encounter again. Depending on
> customer needs and wants, one of our thoughts is to just drop in some
> PHP bulletin board or some other feature complete app.
>
> So if SuperTerrificPylonsWebForumWSGI is basically a black box - I
> configure its colors, templates, etc, but expect no other integration
> with the customer's main site / CMS - what benefits might I get from
> composing via WSGI?

Well, here's how you might do it in Paste Deploy:

[composite:main]
use = egg:Paste#urlmap
/ = cms
/admin = admin_cms
/forum = forum
/forum_phpBB = forum_phpBB

[app:cms]
use = egg:Zope
instance_home = %(here)s/zope
root_object = examplesite
default_view = ExamplePublic

[app:admin_cms]
use = cms
default_view = CMSAdmin

[app:forum]
use = egg:SuperTerrificPylonsWebForumWSGI
database = mysql://localhost/forum_db

[app:forum_phpBB]
use = egg:wphp
base_dir = %(here)s/phpBB

But then let's say you want all these pieces to look similar:

[composite:main]
...
/_theme_files = theme_files
filter-with = deliverance

[app:theme_files]
use = egg:Paste#static
document_root = %(here)s/theme_files

[filter:deliverance]
use = egg:Deliverance
theme_uri = /_theme_files/blank_theme.html
rule_uri = /_theme_files/rules.xml

And then all the content, regardless of its source (could be PHP, piped in via HTTP, or static files) gets piped through Deliverance, which wraps it all in the same outer theme.
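
As a rough illustration of what the [composite:main] section above asks urlmap to do, here is a minimal sketch of prefix dispatch in plain WSGI. This is not Paste's implementation: make_prefix_dispatcher is a hypothetical name, and the sketch assumes only the usual SCRIPT_NAME/PATH_INFO convention and ignores the domain and port matching that urlmap also offers.

```python
def make_prefix_dispatcher(mapping):
    """Return a WSGI app that routes by URL path prefix.

    `mapping` is a dict such as {'/': cms, '/admin': admin_cms}.
    Hypothetical helper for illustration only.
    """
    # Try the longest prefix first so '/admin' wins over '/'.
    routes = sorted(
        ((prefix.rstrip('/'), app) for prefix, app in mapping.items()),
        key=lambda route: len(route[0]),
        reverse=True,
    )

    def dispatcher(environ, start_response):
        path = environ.get('PATH_INFO', '')
        for prefix, app in routes:
            # Match whole path segments only: '/admin' must not
            # capture '/administrator'.
            if path == prefix or path.startswith(prefix + '/'):
                # Move the matched prefix from PATH_INFO to SCRIPT_NAME
                # so the wrapped app sees itself mounted at its own root.
                environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '') + prefix
                environ['PATH_INFO'] = path[len(prefix):]
                return app(environ, start_response)
        start_response('404 Not Found', [('Content-Type', 'text/plain')])
        return [b'Not Found']

    return dispatcher
```

The point of shifting SCRIPT_NAME is that each mounted application can generate correct URLs without knowing where it was mounted, which is what lets the config file decide the layout.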
An even more common use would be to wrap everything in an authentication middleware that sets REMOTE_USER, something that can even be used by PHP apps (at least some PHP apps, like WordPress, make using this kind of authentication pretty easy). You can mostly do all this stuff via passing HTTP around, and I actually really like the ability to easily do HTTP requests based on a WSGI request, but it's a lot easier to exchange request information in WSGI than HTTP by itself. -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From fumanchu at amor.org Thu Mar 8 01:21:35 2007 From: fumanchu at amor.org (Robert Brewer) Date: Wed, 7 Mar 2007 16:21:35 -0800 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <88d0d31b0703071558t179d13cdi3fc3b167f1a789be@mail.gmail.com> Message-ID: <435DF58A933BA74397B42CDEB8145A8609E8DA77@ex9.hostedexchange.local> Jeff Shell wrote: > On 3/7/07, Robert Brewer wrote: > > Jeff Shell wrote: > > > Configuration and deployment? > > > > > > I'm trying to understand the scope of these terms (or > this combined > > > term) better. I take it 'configuration' means just how an > 'app' might > > > publish itself to a WSGI server. Is that right? > > > > > > For us, deployment now is: > > > > > > 1. Make a Zope 3 instance home ('appserv1') > > > 2. `cd appserv1/lib/python; cvs checkout customerapp` > > > 3. `rockout -vv customerapp/Rockfile install` (installs > > > dependencies, mostly > > > by CVS / Subversion checkout, usually directly into > > > `appserv1/lib/python`) > > > 4. `cd ../../etc` (back to 'appserv1/etc') > > > 5. choose a port number in zope.conf (the zope/twisted > server config) > > > 6. add two lines to Zope 3's `site.zcml` to set up our app > > > 7. add site info to apache (rewriterule(s) / proxy). > > > > > > Is this analogous to the deployment and configuration > being discussed? 
> > > > Yes, although I want to make sure we keep discussion of 'site > > installation' very separate from 'website composition' (where you > > already have all the pieces and just need to declare where > they are and > > how they map to URL's). IMO site installation is a 3 to > 5-year project; > > website composition is a one-year project that shouldn't > get bogged down > > in the former. > > Could you elaborate more on these terms? To whom do the spans 'one > year project' and '3 to 5 year project' apply? I meant those terms to apply to web-sig and any work we do on this list to produce specs, libraries, or tools to address such domains in a common fashion. That is, I think it would take 3 to 5 years for web-sig to produce a 'site installation' tool (although leveraging setuptools could be part of this timeframe), but only a year to produce a initial, reasonable spec or tool for composing and controlling websites built with WSGI components. > Often we have web apps, written in Zope 3, that are really two or more > web apps. Like an 'admin' side and 'public' side, typically handled > via different skins/views. Apache rewrite rules basically handle that > routing. So in my mind, if I deploy our CMS, I have the following URL > maps: > > http://example.com/admin/(.*) => examplesite/++skin++CMSAdmin/$1 > http://example.com/(.*) => examplesite/++skin++ExamplePublic/$1 > > Same Zope application, with just a couple of different settings based > on the incoming URL, and then Zope and our app handles the rest of the > URL. > > Is that a site installation? Two site installations? Or two examples > of website composition? Again, I'm just trying to understand the > terminology and map it to the way I'm used to working, and I think of > the above as 'site installation'. In my book, that would be one site, two apps (and in this example, one framework). 
And I never use the word "installation" to describe the site; to me it's always used as an adjective (as in the phrase "installation process"). > The other tried and true example I can think of is when a customer > asks "uhm, and can we have a forum with that?" I guess website > composition might include the above two URL maps, plus one for: > > http://example.com/forum/(.*) => SuperTerrificPylonsWebForumWSGI > > But should this be the provence of WSGI? It's reasonable to ask for that, IMO. Many people are already using WSGI to do just that sort of mixing. The issue we're discussing is that there are currently several ways to declare/compose a stack of WSGI components, and we'd like to see if we can standardize. > With Apache rewrite rules, if > I was doing such a blunt grafting of 'forum' onto my customer's site, > I could just as easily use phpBB. Then I'm not limiting myself to > Python if I feel there's a better suited tool for a particular task. > > I brought up this forum example because it's something we've run into > a couple of times and may be about to encounter again. Depending on > customer needs and wants, one of our thoughts is to just drop in some > PHP bulletin board or some other feature complete app. http://www.google.com/search?q=wsgi+php Robert Brewer System Architect Amor Ministries fumanchu at amor.org From chad at zetaweb.com Thu Mar 8 04:36:11 2007 From: chad at zetaweb.com (Chad Whitacre) Date: Wed, 07 Mar 2007 22:36:11 -0500 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <21787a9f0703070408l70baf159r72a4865e19d75cc7@mail.gmail.com> References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> <21787a9f0703070408l70baf159r72a4865e19d75cc7@mail.gmail.com> Message-ID: <45EF84AB.3040807@zetaweb.com> James, Thanks for weighing in. 
>> I'd love to get some input who know a lot about what makes >> deploying PHP apps so easy. > > In a past life I had a fair amount of experience working with > and deploying PHP, so I'll throw in my $0.02. > > It's worth pointing out that a lot of the "PHP is easier" > perception is largely just that -- a perception. I don't have tons of PHP experience, but I did just finish working on a pretty sizable job, and the deployment was anything but easy. Instead it was a brittle amalgam of XML, Apache conf, and nasty PHP abstractions. My impression is that PHP is easy for simple cases (unpack WordPress and go), but quickly gets ugly when you start dealing with frameworks. So maybe Python is the opposite? Harder for the simple cases, but more elegant in the more complicated scenarios. chad From chad at zetaweb.com Thu Mar 8 04:45:53 2007 From: chad at zetaweb.com (Chad Whitacre) Date: Wed, 07 Mar 2007 22:45:53 -0500 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <45EF3372.4020007@colorstudy.com> References: <45E99DC1.4010703@zetaweb.com> <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> <45EF3372.4020007@colorstudy.com> Message-ID: <45EF86F1.4050709@zetaweb.com> > Anyway, my feelings are that it's: (a) simple hierarchy through > the filesystem (which will make Chad all excited ;) BLAM!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
From sidnei at enfoldsystems.com Thu Mar 8 04:46:05 2007 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Thu, 8 Mar 2007 00:46:05 -0300 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <45EF4C47.8060700@colorstudy.com> References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> <88d0d31b0703071455g746fc952x4217861043e76399@mail.gmail.com> <45EF4C47.8060700@colorstudy.com> Message-ID: On 3/7/07, Ian Bicking wrote: > In theory you can set up Zope 3 using something like: > > [app:main] > paste.app_factory = some_function_yet_to_be_written > > I thought zope.paste did this, but it's a little wonky now that I look > at it. Well, you're probably missing something then, from [1]: """ [app:Paste.Main] paste.app_factory = zope.paste.application:zope_publisher_app_factory """ > It seems to basically read INSTANCE_HOME and create a single > Zope WSGI app, and then kind of minimally plug into it. It's actually a mixed bag. It looks for INSTANCE_HOME to know where to find paste.ini. The second thing it does is to help register an IServerType factory so that you can actually run the WSGI app created with the included-in-zope3 Twisted WSGI server. I don't recall if it runs with ZServer too, probably does. So, to some extent, it wasn't meant to make Zope 3 a WSGI that can be run anywhere, it was just meant to make it possible to use paste to compose 'a' WSGI app that uses zope.app.publication that could be run by Twisted or ZServer. Now, I'm not saying that it can't evolve into something that makes Zope 3 run as a WSGI anywhere. It just wasn't the original intent. 
[1] http://awkly.org/2006/01/25/zopepaste-wsgi-applications-in-zope-3-using-pastedeploy/ -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 From chad at zetaweb.com Thu Mar 8 05:55:37 2007 From: chad at zetaweb.com (Chad Whitacre) Date: Wed, 07 Mar 2007 23:55:37 -0500 Subject: [Web-SIG] windows, pywebd, webctl Message-ID: <45EF9749.3070002@zetaweb.com> All, Windows ======= Sidnei, et al.: your points are well-taken and your expertise appreciated. Thanks! pywebd ====== Bob: I'm on board with your vision for a common server library here. Count me in. webctl/filesystem layout/config syntax ====================================== This is looking less hopeful as a place to collaborate: - An executable needs a config file on the command line, and/or a config file in a pre-determined place. - *Requiring* a config file on the command line is butt-ugly. - Our opinions regarding filesystem layout seem to be, um, non-overlapping. I'd like to venture one more round on this, however, before giving up on it: - It might be the case that Zope only has a few files in an INSTANCE_HOME, but I find myself putting quite a bit in a site's userland: - I'll install Python packages in there wholesale, so I get their scripts in bin/, and lots of modules in lib/python. - I have multiple configuration files in etc/ (as discussed) along with templates in etc/templates/. - I put documentation in doc/. - I have site-specific utility or cron scripts in bin/. - I have extra data files in var/. Keeping it all in svn means that a website is very nearly self-contained and isolated, requiring not much besides Python to be installed in the base system. This is great for many-sites-on-one-server. - For one-site-on-many-servers, why does a Unix-y userland for development conflict with deployment? That is, why can't a development userland simply be installed into /usr/local for deployment? 
Surely logging differences could be handled in configuration, no? - Besides, my proposal only specified two requirements: etc/.conf lib/python Is there really a Unix sysadmin that would balk at this? This is all that's really needed for a common executable to get your site online. Lay out the rest however you want. - Jim, you hold particular distain for lib/python, but it's probably the best example of my "standards enable tools to evolve" argument: lib/python buys you distutils, setuptools, easy_install, workingenv, etc. - This same principle makes sense of runzope, scriptzope, and debugzope: standardize the file format (= fs layout), and such tools fit perfectly in /usr/local/bin. - Almost all of the Windows discussion has centered on daemons vs. services. Sidnei, et al.: what does a "native" Windows filesystem layout look like for a web application? Is using a self-contained Unix-inspired layout faux pas? - As mentioned wrt PHP, users like familiar filesystem layouts. Reaching agreement here improves our story for newcomers. A common executable (= common fs layout) may very well be pushing the limits of collaboration too far, but I'll feel better about admitting that if we pursue the conversation a bit further. chad From sidnei at enfoldsystems.com Thu Mar 8 06:09:47 2007 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Thu, 8 Mar 2007 02:09:47 -0300 Subject: [Web-SIG] windows, pywebd, webctl In-Reply-To: <45EF9749.3070002@zetaweb.com> References: <45EF9749.3070002@zetaweb.com> Message-ID: On 3/8/07, Chad Whitacre wrote: > - Almost all of the Windows discussion has centered on daemons > vs. services. Sidnei, et al.: what does a "native" Windows > filesystem layout look like for a web application? Is using a > self-contained Unix-inspired layout faux pas? It depends on what you consider a 'web application': - If it's ASP or ASP.NET, it's just a bunch of files dropped in a directory, just like PHP. 
It usually has its configuration in a 'web.config' or similar in the same directory. - But typically (well, before IIS 7) a 'web application' was recommended to be implemented as an ISAPI Extension. That's basically a DLL that you register through the IIS Management Console. I could envision an ISAPI Extension that you register for some file extension (or for '*') and that basically delegates to Paste. Oh, hey, that sounds like ISAPI WSGI [1][2]. [1] http://code.google.com/p/isapi-wsgi/wiki/ISAPISimpleHandlerDocs [2] http://pylonshq.com/project/pylonshq/wiki/ServePylonsWithIIS -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 From jim at zope.com Thu Mar 8 12:45:18 2007 From: jim at zope.com (Jim Fulton) Date: Thu, 8 Mar 2007 06:45:18 -0500 Subject: [Web-SIG] daemon tools In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local> <475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com> Message-ID: <9783FC39-D805-4CDB-BB7A-D68BE963567C@zope.com> On Mar 7, 2007, at 8:42 AM, Sidnei da Silva wrote: > On Windows, the NT Service Controller does all the dirty job. And it's > pretty easy to write a service in Python that can run any application. > The simplest Python service is shorter than 30 lines I think. Would such a controller: - Invoke the application as a subprocess, or - Be part of the application. (For example, would it be more like ll.daemon or zdaemon?) ... > There's some stuff from zdaemon that would be useful though, and do > not work on Windows today due to some over-unixism in zdaemon, like an > interactive prompt and script runner as 'zopectl debug' and 'zopectl > run', I'm sure those two don't need to know about 'fork' or signals. Note that the scope of zdaemon, as its name implies, was always limited to Unix.
If it is reasonable to do so, I'd be happy to see a single tool that handled both cases, although if there is a choice between a single tool that handled both cases adequately and 2 tools that handled both cases well, I'd pick the latter. Also note that the "script runner" feature you mention isn't part of zdaemon. zdaemon has a subclassing interface, which is currently undocumented, that Zope uses to add the "run" and "debug" commands. These are Zope specific. As I mentioned earlier, I'd personally be happy to see the shell mode go. > What I'm really interested in is in how the service would communicate > with the program being controlled. This is the painful part, and where > I think we need to work together to make sure it works on Windows and > on *nix platforms. You can surely count on me to discuss that part. I think an event model, as Robert Brewer described, is a good start. > As I mentioned on another thread, Zope uses 'signals' on *nix, and > 'named events' on Windows, by means of the 'Signals' package in Zope. I'm not familiar with that. :) So that unifies Unix signals and Windows events? Interesting. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From chris at simplistix.co.uk Thu Mar 8 10:36:48 2007 From: chris at simplistix.co.uk (Chris Withers) Date: Thu, 08 Mar 2007 09:36:48 +0000 Subject: [Web-SIG] [Proposal] "website" and first-level conf In-Reply-To: <321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com> References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> <45EDC772.3090803@simplistix.co.uk> <321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com> Message-ID: <45EFD930.1040406@simplistix.co.uk> Jim Fulton wrote: > > Having everything in one folder is great for development. It isn't so > good for deployment, at least not on Unix. Can you explain why?
I do a lot of unix deployment, and the thought of a buildout that sprays files all over the system, even if they are in standard unix-y locations, scares me a lot... > (I can think of lots of > reasons why it wouldn't be great on Windows either.) I'm interested to hear these too since all the Microsoft apps I know of tend to have a "one folder" model... > For example, site > administrators like to keep log files together and separate from other > files. As a site admin myself, I like to keep log files together, but on a per-project basis, I think it's a personal preference thing... > Even if things are all together, there's really no point in having > separate subdirectories, typically containing only one or 2 files, Yep, you've persuaded me on that :-) > single directory containing the few needed files directly. The only > exception to this for me would be to have a subdirectory for Python > modules, if you have instance specific Python modules. Indeed. Again, I prefer to have all non-standard-library modules and packages in the instance home, so different versions don't interfere with each other. Yes, this pattern is probably most suited to a development environment, but being able to svn the whole instance and just check that out on the production servers is something I personally find very powerful. > But without these, you have something like: > > zope.conf > zopectl > runzope > debugzope > scriptzope > Data.fs > zope.log > > It is pretty clear that zope.conf is a configuration file, zope.log is a > log file, and that Data.fs. On Unix, it's pretty clear that the others > are scripts, because they're executable and, on Windows, they should > have .bat or .exe suffixes. Agreed, I care less about the folders than I thought ;-) Although if pressed I think I'd still prefer them than not... > I'm not sure if you are referring to more than scripts. I agree that we > shouldn't have put utility scripts in instances.
No, it's the utility scripts that I think are a nightmare waiting to happen the first time one of them changes as part of a Zope upgrade. > I would argue that > only the ctl script should go in instances. The runzope, scriptzope, > and debugzope scripts could be completely generic and invoked by an > instance specific ctl script. Exactly :-) > This is what I do in my latest Zope 3 > buildout recipes. Are those recipes available anywhere? > Only for a particular definition of "works". No experienced Unix > administrator would say it works on Unix. I suspect that a professional > Windows server adminstrator would have similar concerns. I don't agree with either of these at the moment. What's the reasoning for wanting to spray files from one project all over the filesystem? > My original point was not to advocate a particular layout but to point > out that different layouts will be needed in different situations and > that mandating a particular layout was likely to cause problems. Yep, now that's something I strongly agree with :-) cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From chris at simplistix.co.uk Thu Mar 8 10:40:42 2007 From: chris at simplistix.co.uk (Chris Withers) Date: Thu, 08 Mar 2007 09:40:42 +0000 Subject: [Web-SIG] windows daemon tools In-Reply-To: <20070307113414.5ee7384a@Fenix> References: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local> <475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com> <20070307113414.5ee7384a@Fenix> Message-ID: <45EFDA1A.5000606@simplistix.co.uk> Rodrigo Senra wrote: > For symmetry's sake in Windows a Python service manager could simply > use SCManager API under the hood (through win32all) to get the job done, > still keeping a consistent cross-platform modus operandi. 
I would love to see this, particularly for Zope, although I sadly don't have the skill to implement :-( Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From chris at simplistix.co.uk Thu Mar 8 10:55:49 2007 From: chris at simplistix.co.uk (Chris Withers) Date: Thu, 08 Mar 2007 09:55:49 +0000 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> References: <45E99DC1.4010703@zetaweb.com> <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> Message-ID: <45EFDDA5.4010205@simplistix.co.uk> Jim Fulton wrote: > On Mar 5, 2007, at 4:38 PM, Phillip J. Eby wrote: > ... >> Personally, I don't care for the Paste Deploy syntax -- frankly >> it's almost barbaric. :) > > I don't mean to pick on you, but I really *hate* comments like this. That's okay ;-) > criticism. I'd appreciate it if we would all just ignore statements > like this and, preferably, stop making them. ...but I don't think this is. I'd much prefer to hear people's gut feelings, even if they can't justify them. It all gives indication. Yes, if only one person says "this sucks", then their opinion may not be worth changing the implementation for. However, if 50% of users said "this sucks", even if they couldn't explain why, that'd be something worth worrying about. > A few years back, we created a library to parse more sophisticated > apache-like syntax and I wish we hadn't. I'm glad ZConfig exists. > The ini/config format is > pretty standard and, IMO, really quite adequate. How does it handle nesting? 
cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From fdrake at gmail.com Thu Mar 8 13:30:33 2007 From: fdrake at gmail.com (Fred Drake) Date: Thu, 8 Mar 2007 07:30:33 -0500 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <45EFDDA5.4010205@simplistix.co.uk> References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> <45EFDDA5.4010205@simplistix.co.uk> Message-ID: <9cee7ab80703080430yc62a8bch3f3e08462d7f5792@mail.gmail.com> On 3/8/07, Chris Withers wrote: > I'm glad ZConfig exists. Me too, though it does many things differently than if I'd had free reign. > How does it handle nesting? It doesn't, but an application can use explicit references to other sections. It doesn't take care of things magically without some additional help, for which we've avoided premature abstraction. The .ini format is working quite well for zc.buildout, I think. The support for layering multiple files is quite nice, and is completely explicit. -Fred -- Fred L. Drake, Jr. "Every sin is the result of a collaboration." --Lucius Annaeus Seneca From sidnei at enfoldsystems.com Thu Mar 8 15:01:03 2007 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Thu, 8 Mar 2007 11:01:03 -0300 Subject: [Web-SIG] daemon tools In-Reply-To: <9783FC39-D805-4CDB-BB7A-D68BE963567C@zope.com> References: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local> <475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com> <9783FC39-D805-4CDB-BB7A-D68BE963567C@zope.com> Message-ID: On 3/8/07, Jim Fulton wrote: > > On Mar 7, 2007, at 8:42 AM, Sidnei da Silva wrote: > > > On Windows, the NT Service Controller does all the dirty job. And it's > > pretty easy to write a service in Python that can run any application. 
> > The simplest Python service is shorter than 30 lines I think. > > Would such a controller: > > - Invoke the application as a subprocess, or > > - Be part of the application. (For example, would it be more like > ll.daemon or zdaemon?) > > ... Well, it could be both really. But of course the easiest to integrate with (from the I-dont-want-to-learn-anything-about-windows perspective) would be the former. > > There's some stuff from zdaemon that would be useful though, and do > > not work on Windows today due to some over-unixism in zdaemon, like an > > interactive prompt and script runner as 'zopectl debug' and 'zopectl > > run', I'm sure those two don't need to know about 'fork' or signals. > > Note that the scope of zdaemon, as its name implies, was always > limited to Unix. If it is reasonable to do so, I'd be happy to see a > single tool that handled both cases, although if there is a choice > between a single tool that handled both cases adequately and 2 tools > that handled both cases well, I'd pick the latter. > > Also note that the "script runner" feature you mention isn't part of > zdaemon. zdaemon has a subclassing interface, which is currently > undocumented, that Zope uses to add the "run" and "debug" commands. > These are Zope specific. Yeah, so I thought. > > As I mentioned on another thread, Zope uses 'signals' on *nix, and > > 'named events' on Windows, by means of the 'Signals' package in Zope. > > I'm not familiar with that. :) So that unifies Unix signals and > windows events? Interesting. Well, it doesn't 'unify them' in the sense that you still have to send Windows named events, but the event name indicates the expected signal, by using 'Zope-<pid>-<signal>'. So an event named 'Zope-1214-9' means SIGKILL for the pid 1214, for example.
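
To make the naming convention concrete, here is a small sketch that builds and parses event names shaped like the 'Zope-1214-9' example. The code of Zope's Signals package is not quoted in this thread, so this is not that code: both function names are hypothetical, and the sketch covers only the string format, not the Windows calls that would create or wait on the named event.

```python
def signal_event_name(pid, signum):
    """Build the event name a controller would set for `pid`.

    Hypothetical helper; mirrors the 'Zope-1214-9' example only.
    """
    return "Zope-%d-%d" % (pid, signum)


def parse_signal_event_name(name):
    """Split an event name back into (pid, signal number)."""
    parts = name.split("-")
    if len(parts) != 3 or parts[0] != "Zope":
        raise ValueError("not a Zope signal event name: %r" % (name,))
    return int(parts[1]), int(parts[2])
```

With this encoding the controlled process only needs to know its own pid to work out which event names it should wait on, one per signal it wants to handle.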
-- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 From janssen at parc.com Thu Mar 8 15:00:52 2007 From: janssen at parc.com (Bill Janssen) Date: Thu, 8 Mar 2007 06:00:52 PST Subject: [Web-SIG] daemon tools In-Reply-To: <20070307113414.5ee7384a@Fenix> References: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local> <475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com> <20070307113414.5ee7384a@Fenix> Message-ID: <07Mar8.060056pst."57996"@synergy1.parc.xerox.com> > For symmetry's sake in Windows a Python service manager could simply > use SCManager API under the hood (through win32all) to get the job done, > still keeping a consistent cross-platform modus operandi. That's what I do in UpLib. Works pretty well. Bill From rodsenra at gpr.com.br Thu Mar 8 15:02:44 2007 From: rodsenra at gpr.com.br (Rodrigo Senra) Date: Thu, 8 Mar 2007 11:02:44 -0300 Subject: [Web-SIG] [Proposal] "website" and first-level conf In-Reply-To: <45EFD930.1040406@simplistix.co.uk> References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> <45EDC772.3090803@simplistix.co.uk> <321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com> <45EFD930.1040406@simplistix.co.uk> Message-ID: <20070308110244.56b81bd5@Fenix> [ Chris Withers ]: |I do a lot of unix deployment, and the thought of |a buildout that sprays files all over the system, even if they are in |standard unix-y location scares me a lot... I am very sympathetic to the idea of keeping related thing together. But I have some use cases (counter-examples) to contribute: - multiple Zope instances sharing libraries, python modules, and Zope/Plone Products. These files might be placed out of the instance tree. - when the Unix Adm is **not SomeFramework-wise** there is (might be) a demand to keep backup-electable-stuff somewhere he/she/it wants (like /etc instead of /someApp/etc). 
Even if with keep the files inside app's tree, deploy scripts might have to create hard links outside that tree. - logs and data (like Data.fs).... see below |> For example, site administrators like to keep log files |> together and separate from other files. | |As a site admin myself, I like to keep log files together, but on a |per-project basis, I think it's a personal preference thing... |I don't agree with either of these at the moment. What's the reasoning |for wanting to spray files from one project all over the filesystem? - one optimization (we actually do) is to create different disk partitions. One optimized for *large* files (like logs and databases) and other for small files (like source code, libraries and config files). In spite of that, I would love to keep deploys *totally* self-contained. Nevertheless, I was not wise enough to workaround some of the use cases presented above ;o) best regards, Senra ------------- Rodrigo Senra GPr Sistemas http://www.gpr.com.br From benji at benjiyork.com Thu Mar 8 15:13:54 2007 From: benji at benjiyork.com (Benji York) Date: Thu, 08 Mar 2007 09:13:54 -0500 Subject: [Web-SIG] [Proposal] "website" and first-level conf In-Reply-To: <45EFD930.1040406@simplistix.co.uk> References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> <45EDC772.3090803@simplistix.co.uk> <321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com> <45EFD930.1040406@simplistix.co.uk> Message-ID: <45F01A22.5090301@benjiyork.com> Chris Withers wrote: > Jim Fulton wrote: >> Having everything in one folder is great for development. It isn't so >> good for deployment, at least not on Unix. > > Can you explain why? I do a lot of unix deployment, and the thought of a > buildout that sprays files all over the system, even if they are in > standard unix-y location scares me a lot... I think it depends on the people involved. As a developer I prefer everything in one place. 
Our system administrators have to manage lots of machines (hundreds) with lots of software on them (some we write, some third party). Their perspective is that if they want to find a log file it's better for it to be where all the log files are instead of trying to find the corner of the file system where that particular app is installed. This appears to be the preference of most unix admins (as evidenced by the various linux/unix standardization processes). -- Benji York http://benjiyork.com From chris at simplistix.co.uk Fri Mar 9 11:02:21 2007 From: chris at simplistix.co.uk (Chris Withers) Date: Fri, 09 Mar 2007 10:02:21 +0000 Subject: [Web-SIG] ConfigParser for configuration In-Reply-To: <9cee7ab80703080430yc62a8bch3f3e08462d7f5792@mail.gmail.com> References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> <45EFDDA5.4010205@simplistix.co.uk> <9cee7ab80703080430yc62a8bch3f3e08462d7f5792@mail.gmail.com> Message-ID: <45F130AD.1000904@simplistix.co.uk> Fred Drake wrote: > On 3/8/07, Chris Withers wrote: >> I'm glad ZConfig exists. > > Me too, though it does many things differently than if I'd had free reign. You have free reign now, right? ;-) >> How does it handle nesting? > > It doesn't, but an application can use explicit references to other > sections. You mean like the format expected by logging.config.fileConfig? > It doesn't take care of things magically without some > additional help, for which we've avoided premature abstraction. Not sure what this means... Okay, so, say I have a config.ini and I want to have logging sections for using in logging.config.fileConfig and other sections for use by my app's config. How would I share the one config file between fileConfig and whatever my app uses to tickle ConfigParser? Would each section have to parse the file? 
Would they get confused about keys not designed for them?

Can one config.ini include other .ini files in the same way ZConfig allows?

> The .ini format is working quite well for zc.buildout, I think. The
> support for layering multiple files is quite nice, and is completely
> explicit.

What is this support for layering multiple files? I couldn't find it anywhere in the ConfigParser docs :-S

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
- http://www.simplistix.co.uk

From chris at simplistix.co.uk Fri Mar 9 11:05:19 2007
From: chris at simplistix.co.uk (Chris Withers)
Date: Fri, 09 Mar 2007 10:05:19 +0000
Subject: [Web-SIG] [Proposal] "website" and first-level conf
In-Reply-To: <20070308110244.56b81bd5@Fenix>
References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> <45EDC772.3090803@simplistix.co.uk> <321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com> <45EFD930.1040406@simplistix.co.uk> <20070308110244.56b81bd5@Fenix>
Message-ID: <45F1315F.4000000@simplistix.co.uk>

Rodrigo Senra wrote:
> [ Chris Withers ]:
> - multiple Zope instances sharing libraries, python modules,
>   and Zope/Plone Products. These files might be placed out of
>   the instance tree.

Sometimes you want this, sometimes you don't ;-) You want it if you have lots of homogeneous projects that all use the same products and libraries. For me, it's much more common to need to isolate projects because they rely on specific versions of products and libraries and often break if they have access to the wrong one...

> - when the Unix Adm is **not SomeFramework-wise** there is (might be)
>   a demand to keep backup-electable-stuff somewhere he/she/it
>   wants (like /etc instead of /someApp/etc). Even if we keep
>   the files inside app's tree, deploy scripts might have to create
>   hard links outside that tree.

OK, this is a good argument for making the location selectable ;-)

> - one optimization (we actually do) is to create different disk
>   partitions.
One optimized for *large* files (like logs and > databases) and other for small files (like source code, libraries > and config files). I've never seen the need myself, what measurable differences has this made? > In spite of that, I would love to keep deploys *totally* self-contained. > Nevertheless, I was not wise enough to workaround some of the use cases > presented above ;o) Sounds like we really need to support both... Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From chris at simplistix.co.uk Fri Mar 9 11:06:30 2007 From: chris at simplistix.co.uk (Chris Withers) Date: Fri, 09 Mar 2007 10:06:30 +0000 Subject: [Web-SIG] [Proposal] "website" and first-level conf In-Reply-To: <45F01A22.5090301@benjiyork.com> References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> <45EDC772.3090803@simplistix.co.uk> <321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com> <45EFD930.1040406@simplistix.co.uk> <45F01A22.5090301@benjiyork.com> Message-ID: <45F131A6.1090903@simplistix.co.uk> Benji York wrote: > with lots of software on them (some we write, some third party). Their > perspective is that if they want to find a log file it's better for it > to be where all the log files are instead of trying to find the corner > of the file system where that particular app is installed. Yeah, on the log front I have to agree... I've found myself more often just heading to /var/log/ instead of wanting to hunt elsewhere... 
cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From jim at zope.com Fri Mar 9 13:58:06 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 9 Mar 2007 07:58:06 -0500 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <45EF84AB.3040807@zetaweb.com> References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> <21787a9f0703070408l70baf159r72a4865e19d75cc7@mail.gmail.com> <45EF84AB.3040807@zetaweb.com> Message-ID: <1E39B557-F94D-4BCF-9D05-CE5DBCC76C8B@zope.com> On Mar 7, 2007, at 10:36 PM, Chad Whitacre wrote: > James, > > Thanks for weighing in. > >>> I'd love to get some input who know a lot about what makes >>> deploying PHP apps so easy. >> >> In a past life I had a fair amount of experience working with >> and deploying PHP, so I'll throw in my $0.02. >> >> It's worth pointing out that a lot of the "PHP is easier" >> perception is largely just that -- a perception. > > I don't have tons of PHP experience, but I did just finish > working on a pretty sizable job, and the deployment was anything > but easy. Instead it was a brittle amalgam of XML, Apache conf, > and nasty PHP abstractions. My impression is that PHP is easy for > simple cases (unpack WordPress and go), but quickly gets ugly > when you start dealing with frameworks. > > So maybe Python is the opposite? Harder for the simple cases, but > more elegant in the more complicated scenarios. I don't think this is the case, or, I don't think it has to be. It would be interesting if PHP was simple to deploy for simple applications and complex to deploy for complex application. 
That would inform our discussion quite a bit, IMO, as I think it would be far easier for us to make Python easier to install for simple applications than it would be for us to make Python easier to install for complex applications. We could bring tools to bear that would be appropriate to the problem. Maybe this would be a good place to start. Dang, I wish I had time to. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Fri Mar 9 14:52:38 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 9 Mar 2007 08:52:38 -0500 Subject: [Web-SIG] windows, pywebd, webctl In-Reply-To: <45EF9749.3070002@zetaweb.com> References: <45EF9749.3070002@zetaweb.com> Message-ID: On Mar 7, 2007, at 11:55 PM, Chad Whitacre wrote: > All, > > Windows > ======= > > Sidnei, et al.: your points are well-taken and your expertise > appreciated. Thanks! > > > pywebd > ====== > > Bob: I'm on board with your vision for a common server library > here. Count me in. > > > webctl/filesystem layout/config syntax > ====================================== > > This is looking less hopeful as a place to collaborate: > > - An executable needs a config file on the command line, and/or > a config file in a pre-determined place. > > - *Requiring* a config file on the command line is butt-ugly. > > - Our opinions regarding filesystem layout seem to be, um, > non-overlapping. You are missing another alternative. First, keep in mind that with setuptools, "executables" are just wrapper scripts that: - Set up sys.path - Import an entry point, and - Call the entry point These wrapper scripts are *automatically generated*! It is just as easy to generate wrapper scripts that pass the name of a configuration file to the entry point along with other arguments. This is in fact what we're doing. 
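The three steps listed above, plus the extra configuration-file argument, can be sketched as a tiny script generator. This is a hedged illustration, not the generator used by setuptools or zc.buildout: `make_wrapper`, the egg path, the `myapp.main` module, its `run` function and the config path are all hypothetical names chosen for the example.

```python
def make_wrapper(paths, module, function, config_file):
    """Return the source of a wrapper script (hypothetical generator).

    The generated script puts `paths` at the front of sys.path, imports
    `module`, and calls `module.function` with the configuration file
    name followed by the command-line arguments.
    """
    lines = ["import sys", "sys.path[0:0] = ["]
    lines.extend("    %r," % path for path in paths)
    lines.extend([
        "]",
        "",
        "import %s" % module,
        "",
        "if __name__ == '__main__':",
        "    %s.%s([%r] + sys.argv[1:])" % (module, function, config_file),
    ])
    return "\n".join(lines) + "\n"


script = make_wrapper(
    ["/opt/myapp/eggs/myapp.egg"],   # assumed egg location
    "myapp.main",                    # assumed entry point module
    "run",                           # assumed entry point function
    "/etc/myapp/instance1.conf",     # assumed process configuration
)
print(script)
```

Because the configuration path is baked into the generated script, nobody has to type it on the command line, and the script itself does not assume any particular directory layout.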
This means that the script joins the software configuration (represented by the entry point and eggs used) and the process configuration, represented by the configuration file. I'm very happy with how this is working for us.

(If you're interested in the gory details, see:

http://www.python.org/pypi/zc.zope3recipes

In particular, to see an example of the sort of generated script I'm talking about, go to:

http://www.python.org/pypi/zc.zope3recipes#log-files

and scroll up.)

> I'd like to venture one more round on this, however, before
> giving up on it:
>
> - It might be the case that Zope only has a few files in an
>   INSTANCE_HOME, but I find myself putting quite a bit in a
>   site's userland:
>
>   - I'll install Python packages in there wholesale, so I get
>     their scripts in bin/, and lots of modules in lib/python.

Many Zope users put Python packages in their instance homes. I mentioned this in another note as a justification for a subdirectory. Personally, since all of the deployments I do are large and require multiple instances of the *same* application, I prefer to create a separate application installation and then create multiple instances of that. Most Zope users don't seem to need this, however, and combine process instances and application instances into the single concept of a Zope instance.

>
>   - I have multiple configuration files in etc/ (as
>     discussed) along with templates in etc/templates/.

We generally take the view that templates are part of the software and are managed in Python packages.

...

> Keeping it all in svn means that a website is very nearly
> self-contained and isolated, requiring not much besides
> Python to be installed in the base system. This is great for
> many-sites-on-one-server.

Absolutely. We, of course, check everything into svn. We (ZC) use buildouts to assemble the parts we need, which are typically shared across many projects.

>
> - For one-site-on-many-servers, why does a Unix-y userland for
>   development conflict with deployment?
> That is, why can't a
> development userland simply be installed into /usr/local for
> deployment? Surely logging differences could be handled in
> configuration, no?
>

Because site administrators who actually run the servers and who get woken up in the middle of the night when something goes wrong want application files to be in standard places, like /etc, /var/log, and so on. These people are not developers. They are not well served by "self-contained" applications, which are, for them at least, only part of a much bigger system configuration.

Also note that on multi-core multi-process servers, we have many instances of the same application on the same server, so what normally gets put in a traditional zope instance is split between an application definition and multiple process definitions.

In deployment, we install the application definition as an RPM. We then use tools provided by the application definition to create instance configurations based on the particular machine's configuration. (For ZC, the machine configuration happens to come from a centralized database.)

> - Besides, my proposal only specified two requirements:
>
>     etc/.conf
>     lib/python
>
>   Is there really a Unix sysadmin that would balk at this?

Yes.

> This
> is all that's really needed for a common executable to get
> your site online. Lay out the rest however you want.

But you don't actually need this at all.

>
> - Jim, you hold particular disdain for lib/python, but it's
>   probably the best example of my "standards enable tools to
>   evolve" argument: lib/python buys you distutils, setuptools,
>   easy_install, workingenv, etc.

No, actually it doesn't. It is based on an out-of-date convention. workingenv doesn't use it. It uses lib/pythonx.x. Distutils doesn't really use it unless you consider the --home option (or whatever it's called). Distutils is happy to install almost anywhere using --install-lib. easy_install wants to install into your system Python.
lib/python is no easier to supply as an alternate install location than any other. lib/python violates "flat is better than nested" by introducing a pointless lib directory. > - This same principle makes sense of runzope, scriptzope, and > debugzope: standardize the file format (= fs layout), and > such tools fit perfectly in /usr/local/bin. Except that this isn't appropriate for deployment. When you need to do something different, system that assume things about file-system layout produce gordian knots. I speak from experience from work on bending zope installations to the will of the people with the beepers. > - Almost all of the Windows discussion has centered on daemons > vs. services. Sidnei, et al.: what does a "native" Windows > filesystem layout look like for a web application? Is using a > self-contained Unix-inspired layout faux pas? > > - As mentioned wrt PHP, users like familiar filesystem layouts. > Reaching agreement here improves our story for newcomers. I don't have a problem with people using whatever layout they want. I don't even object to having common layouts that are documented and taught. What I can't accept is a software framework that *requires* a particular layout to function. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Fri Mar 9 15:07:47 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 9 Mar 2007 09:07:47 -0500 Subject: [Web-SIG] [Proposal] "website" and first-level conf In-Reply-To: <45EFD930.1040406@simplistix.co.uk> References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> <45EDC772.3090803@simplistix.co.uk> <321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com> <45EFD930.1040406@simplistix.co.uk> Message-ID: <444CE28D-7E2E-478F-9B90-9DD76109FF24@zope.com> On Mar 8, 2007, at 4:36 AM, Chris Withers wrote: > Jim Fulton wrote: >> Having everything in one folder is great for development. 
It >> isn't so good for deployment, at least not on Unix. > > Can you explain why? Yes. See my response to Chad. > I do a lot of unix deployment, and the thought of a buildout that > sprays files all over the system, even if they are in standard unix- > y location scares me a lot... That's because you are a developer. I've worked for the last couple of years with our system administrators supporting major applications at Zope Corporation. For a long time, we did things *our* (the developers) way and they lived with it because they had no choice. As time wore on and I got to experience more of their pain, I realized that maybe they had a clue after all and that If I worked with them rather than complacently assuming that they didn't know the best way to deploy applications, my life would be easier. (Side note: Over time, our management has wised up and our SAs have a lot more power to tell us, the developers, what to do. Fortunately, over the same period, we have come to appreciate their position and so this isn't a problem. :) >> (I can think of lots of reasons why it wouldn't be great on Wndows >> either.) > > I'm interested to hear these too since all the microsoft apps I > know of tend to have a "one folder" model... Yeah, that's why I don't use Windows. :) For years, people word files ended up in the same directory with the word applications. If I was a windows server administrator, I would want the software to be separate from other artifacts. I'd want to be able to update or reinstall the software without losing configuration. I'd want configuration data to be managed separately. This, of course, is what the windows registry does. It puts all of the configuration in one place that is separate from the software install. I'd expect logs to be managed separately. >> single directory containing the few needed files directly. The >> only exception to this for me would be to have a subdirectory for >> Python modules, if you have instance specific Python modules. 
> > Indeed. Again, I prefer to have all non-standard-library modules > and packages in the instance home, so different versions don't > interfere with each other. Yes, this pattern is probably most > suited to development environment, but being able to svn the whole > instance and just check that out on the production servers is > something I personally find very poweful. >> This is what I do in my latest Zope 3 buildout recipes. > > Are those recipes available anywhere? http://www.python.org/pypi/zc.zope3recipes Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Fri Mar 9 15:14:40 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 9 Mar 2007 09:14:40 -0500 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <45EFDDA5.4010205@simplistix.co.uk> References: <45E99DC1.4010703@zetaweb.com> <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> <45EFDDA5.4010205@simplistix.co.uk> Message-ID: <5BA15D06-6EF0-4315-9721-9B793D1E03B8@zope.com> On Mar 8, 2007, at 4:55 AM, Chris Withers wrote: > Jim Fulton wrote: >> On Mar 5, 2007, at 4:38 PM, Phillip J. Eby wrote: >> ... >>> Personally, I don't care for the Paste Deploy syntax -- frankly >>> it's almost barbaric. :) >> I don't mean to pick on you, but I really *hate* comments like this. > > That's okay ;-) > >> criticism. I'd appreciate it if we would all just ignore >> statements like this and, preferably, stop making them. > > ...but I don't think this is. I'd much prefer to hear people's gut > feelings, even if they can't justify them. That's OK over a drink. In an open discussion it is very very counter productive in my experience. > It all gives indication. 
Yes, if only one person says "this sucks", > then their opinion may not be worth changing the implementation > for. However, if 50% of users said "this sucks", even if they > couldn't explain why, that'd be something worth worrying about. Sure, but how do you fix anything if they don't say why it sucks? How do you make it better? How do you even know if they are trying to solve the same problem that you are? Or if they've actually tried the tool your talking about. >> The ini/config format is pretty standard and, IMO, really quite >> adequate. > > How does it handle nesting? Using cross-section references. So, rather than having an embedded section, you have an option that refers to another section (or collection of sections). Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From sidnei at enfoldsystems.com Fri Mar 9 15:26:51 2007 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Fri, 9 Mar 2007 11:26:51 -0300 Subject: [Web-SIG] [Proposal] "website" and first-level conf In-Reply-To: <444CE28D-7E2E-478F-9B90-9DD76109FF24@zope.com> References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> <45EDC772.3090803@simplistix.co.uk> <321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com> <45EFD930.1040406@simplistix.co.uk> <444CE28D-7E2E-478F-9B90-9DD76109FF24@zope.com> Message-ID: On 3/9/07, Jim Fulton wrote: > On Mar 8, 2007, at 4:36 AM, Chris Withers wrote: > > I'm interested to hear these too since all the microsoft apps I > > know of tend to have a "one folder" model... > > Yeah, that's why I don't use Windows. :) That's not a good enough excuse. :) > For years, people word > files ended up in the same directory with the word applications. I think that predates my involvement with computers, or you're misremembering something. > If I was a windows server administrator, I would want the software to be > separate from other artifacts. 
Log files are usually separate from software. For example on XP, IIS 5.1 logs to C:\WINDOWS\system32\LogFiles. As you've already mentioned most configuration ends up on the registry. I don't see any mixing of software and artifacts going on. > I'd want to be able to update or > reinstall the software without losing configuration. Well-behaved software will never touch your configuration. I've developed several installers using Inno Setup and you always have the choice to say what files should be deleted or not on an uninstall. The Zope Installer for Windows never deletes the INSTANCE_HOME. > I'd want > configuration data to be managed separately. This, of course, is > what the windows registry does. It puts all of the configuration in > one place that is separate from the software install. I'd expect > logs to be managed separately. That's totally fine. It could go to C:\WINDOWS\system32\LogFiles too, or it could just log to the NT Event Log, and then you can configure all sorts of things related to for how long those log files are kept. There are also great tools that allow you to query those logs just like if they were SQL databases. -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 From fdrake at gmail.com Fri Mar 9 15:51:52 2007 From: fdrake at gmail.com (Fred Drake) Date: Fri, 9 Mar 2007 09:51:52 -0500 Subject: [Web-SIG] ConfigParser for configuration In-Reply-To: <45F130AD.1000904@simplistix.co.uk> References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> <45EFDDA5.4010205@simplistix.co.uk> <9cee7ab80703080430yc62a8bch3f3e08462d7f5792@mail.gmail.com> <45F130AD.1000904@simplistix.co.uk> Message-ID: <9cee7ab80703090651o2d53c518pff4d52589bd97990@mail.gmail.com> On 3/9/07, Chris Withers wrote: > You have free reign now, right? 
;-)

Heh. Compatibility is worth something, even to me.

> You mean like the format expected by logging.config.fileConfig?

I haven't looked at that in a long time, but I think that's right. Essentially, each user of configuration data has to know which portions of their own configuration contain references to other sections, and then chase those down (or pass them along) to use that information. This would take the form of "formatter = verbose_formatter", and the [verbose_formatter] section would have all the configuration data for the formatter.

> > It doesn't take care of things magically without some
> > additional help, for which we've avoided premature abstraction.
>
> Not sure what this means...

The application itself has to understand that it's creating an arbitrarily nested structure from a simple (two-level) hierarchy. How that happens is part of the application, not a magical helper library.

> Okay, so, say I have a config.ini and I want to have logging sections
> for using in logging.config.fileConfig and other sections for use by my
> app's config.
>
> How would I share the one config file between fileConfig and whatever my
> app uses to tickle ConfigParser? Would each section have to parse the
> file? Would the get confused about keys not designed for them?

If you really want to use logging.config.fileConfig(), I'd suggest your app having something like "logging-configuration = /path/to/logging/config.ini", and using that to call the logging configuration with the indicated file.

> Can one config.ini include other .ini files in the same way ZConfig allows?

No.

> What is this support for layering multiple files? I couldn't find it
> anywhere in the ConfigParser docs :-S

What this needs to be depends on the application. There's a simple layering included in ConfigParser (call read() with multiple filenames, or readfp() more than once), but that doesn't serve zc.buildout well.
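As a rough sketch of the two ideas in this reply, the snippet below layers two files with a single read() call and then follows a "formatter = verbose_formatter" style reference by hand. It uses the Python 3 spelling of the module (`configparser`, where readfp() has since been replaced by read_file()); the file names, section contents and the `resolve_reference` helper are assumptions made for the illustration, and this is not how zc.buildout's "extends" works.

```python
import configparser
import os
import tempfile

BASE = """\
[app]
debug = false
formatter = verbose_formatter

[verbose_formatter]
datefmt = %H:%M
"""

OVERRIDE = """\
[app]
debug = true
"""


def load_layered(paths):
    """Read several .ini files; values in later files win."""
    # interpolation=None keeps '%' in values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(paths)
    return parser


def resolve_reference(parser, section, option):
    """Treat the option's value as a section name and return that section."""
    target = parser.get(section, option)
    return dict(parser.items(target))


with tempfile.TemporaryDirectory() as tmp:
    base_path = os.path.join(tmp, "base.ini")
    override_path = os.path.join(tmp, "override.ini")
    with open(base_path, "w") as f:
        f.write(BASE)
    with open(override_path, "w") as f:
        f.write(OVERRIDE)
    config = load_layered([base_path, override_path])

debug = config.get("app", "debug")
formatter = resolve_reference(config, "app", "formatter")
print(debug, formatter)
```

The parser itself never learns that "formatter" names another section; the application decides which options are references, which is the "no magical helper library" point made above.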
You can look in the zc.buildout documentation and code for what that does; look for "extends". -Fred -- Fred L. Drake, Jr. "Every sin is the result of a collaboration." --Lucius Annaeus Seneca From ianb at colorstudy.com Fri Mar 9 17:18:50 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 09 Mar 2007 10:18:50 -0600 Subject: [Web-SIG] ConfigParser Message-ID: <45F188EA.3070703@colorstudy.com> Since there's lots of talk of ConfigParser, I thought I'd note some code I've written that uses the basic API of ConfigParser but allows for some simple additions; in INITools (http://pythonpaste.org/initools/) specifically initools.configparser: http://pythonpaste.org/initools/initools/configparser.py.html It keeps track of filenames and line numbers so it's possible to give more detailed error messages (though only if you have access to the underlying config parser object), and though not enabled by default it also includes options for things like "extends" to overlap sections, and ${section:value} substitution. Unlike some of the other ConfigParser alternatives out there, it doesn't extend the ini syntax or the types that ini files deal in (i.e., only strings). -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From chad at zetaweb.com Fri Mar 9 19:22:58 2007 From: chad at zetaweb.com (Chad Whitacre) Date: Fri, 09 Mar 2007 13:22:58 -0500 Subject: [Web-SIG] windows, pywebd, webctl In-Reply-To: References: <45EF9749.3070002@zetaweb.com> Message-ID: <45F1A602.3000104@zetaweb.com> Jim, First, your comments re: paying attention to sysadmins are well-taken. Thanks. > I don't have a problem with people using whatever layout they > want. I don't even object to having common layouts that are > documented and taught. What I can't accept is a software > framework that *requires* a particular layout to function. That's fair enough. So what if a proposed common executable acted like this: 1. A configuration file may be specified on the command line. 2. 
If no config file is named on the command line, then look for one in certain locations: /etc/.conf /usr/local/etc/.conf ~/etc/.conf ./etc/.conf 3. .conf does basic process config (address, user/group, threads, etc.) and hands off to a second-layer config (be it paste.ini, zope.conf, etc.) 4. The following are added to PYTHONPATH *if they exist*: ./lib/python2.x ./lib/python2.x/site-packages ./lib/python ./lib/python/site-packages Such an executable would satisfy me. Would it be flexible enough to meet your requirements? chad From jbauer at rubic.com Fri Mar 9 20:30:12 2007 From: jbauer at rubic.com (Jeff Bauer) Date: Fri, 09 Mar 2007 13:30:12 -0600 Subject: [Web-SIG] windows, pywebd, webctl In-Reply-To: <45F1A602.3000104@zetaweb.com> References: <45EF9749.3070002@zetaweb.com> <45F1A602.3000104@zetaweb.com> Message-ID: <45F1B5C4.4000808@rubic.com> Chad Whitacre wrote: > 2. If no config file is named on the command line, then look > for one in certain locations: > > /etc/.conf > /usr/local/etc/.conf > ~/etc/.conf > ./etc/.conf And possibly the current working directory: ./.conf -- Jeff Bauer Rubicon, Inc. From jim at zope.com Fri Mar 9 21:02:23 2007 From: jim at zope.com (Jim Fulton) Date: Fri, 9 Mar 2007 15:02:23 -0500 Subject: [Web-SIG] [Proposal] "website" and first-level conf In-Reply-To: References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> <45EDC772.3090803@simplistix.co.uk> <321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com> <45EFD930.1040406@simplistix.co.uk> <444CE28D-7E2E-478F-9B90-9DD76109FF24@zope.com> Message-ID: On Mar 9, 2007, at 9:26 AM, Sidnei da Silva wrote: > On 3/9/07, Jim Fulton wrote: >> On Mar 8, 2007, at 4:36 AM, Chris Withers wrote: ... >> For years, people word >> files ended up in the same directory with the word applications. > > I think that predates my involvement with computers, or you're > misremembering something. Kids these days! I'm not misremembering. 
>> If I was a windows server administrator, I would want the software >> to be >> separate from other artifacts. > > Log files are usually separate from software. For example on XP, IIS > 5.1 logs to C:\WINDOWS\system32\LogFiles. As you've already mentioned > most configuration ends up on the registry. I don't see any mixing of > software and artifacts going on. I cleverly distracted with you with my snipe at windows. Bwahaha. The origin of this particular point was Chris saying that he thought single directory layouts worked for deployment on all platforms. I suggested that a professional Windows server administrator wouldn't like things in one directory. My point is that, as with Unix, software deployed on Windows separates configuration from logging, from software and so on. A normal windows application doesn't keep everything in one directory as we do on Windows. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jinty at web.de Mon Mar 12 06:41:43 2007 From: jinty at web.de (Brian Sutherland) Date: Mon, 12 Mar 2007 06:41:43 +0100 Subject: [Web-SIG] windows, pywebd, webctl In-Reply-To: <45F1A602.3000104@zetaweb.com> References: <45EF9749.3070002@zetaweb.com> <45F1A602.3000104@zetaweb.com> Message-ID: <20070312054143.GC5066@minipas.home> On Fri, Mar 09, 2007 at 01:22:58PM -0500, Chad Whitacre wrote: > That's fair enough. So what if a proposed common executable acted > like this: > > 1. A configuration file may be specified on the command line. +lots > 2. If no config file is named on the command line, then look > for one in certain locations: > > /etc/.conf > /usr/local/etc/.conf > ~/etc/.conf > ./etc/.conf Perhaps you might want to think about /etc//.conf, because applications generally grow config files. In this case the second-layer config. 
Postgresql even does:

/etc/postgresql/${version}/${instance_name}/

So you can have many instances of many versions running at once. That makes upgrading much easier.

> 3. .conf does basic process config (address, user/group,
> threads, etc.) and hands off to a second-layer config (be it
> paste.ini, zope.conf, etc.)

Perhaps specify the second-layer config file location in the first layer config.

> 4. The following are added to PYTHONPATH *if they exist*:
>
> ./lib/python2.x
> ./lib/python2.x/site-packages
> ./lib/python
> ./lib/python/site-packages

-1

Why not just write additional PYTHONPATH locations into the script when you create it? The thing that creates the executable should know where it's putting the libraries.

> Such an executable would satisfy me. Would it be flexible enough
> to meet your requirements?
>
>
>
>
> chad
> _______________________________________________
> Web-SIG mailing list
> Web-SIG at python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: http://mail.python.org/mailman/options/web-sig/jinty%40web.de
>

--
Brian Sutherland

From jinty at web.de Mon Mar 12 06:26:40 2007
From: jinty at web.de (Brian Sutherland)
Date: Mon, 12 Mar 2007 06:26:40 +0100
Subject: [Web-SIG] windows, pywebd, webctl
In-Reply-To: <45F1A602.3000104@zetaweb.com>
References: <45EF9749.3070002@zetaweb.com> <45F1A602.3000104@zetaweb.com>
Message-ID: <20070312052640.GB5066@minipas.home>

On Fri, Mar 09, 2007 at 01:22:58PM -0500, Chad Whitacre wrote:
> Jim,
>
> First, your comments re: paying attention to sysadmins are
> well-taken. Thanks.

I was pointed to this conversation and would like to comment wearing my sysadmin hat about what I would like. How I think web applications should be installed on unix. Basically, I'll just go through what happens when I install apache, squid or postgres on linux.
When I install an application that is a daemon, I want the following things to happen automatically:

* A new user for the daemon to run as is created to protect the daemon from the other users and the other users from the system.
* A default config is placed unless one already exists in /etc//*.conf
* Directories are laid out according to the FHS, _with_the_correct_permissions_.
* Logrotate config placed in /etc/logrotate.d/
* Initscripts placed in /etc/init.d and symlinked to /etc/rc*.d
* Server is started
* Hopefully logging is via syslog with reasonable rules in /etc/logcheck
* SEL Policy (perhaps in future)
* Upgrades from previous versions handled
* Various other files placed around /etc

And, when I de-install, I want all of these things cleaned up in the right way according to my specific flavor of Linux.

Currently when installing Zope, because of the way the instance model is hardwired, I have to do a lot of these things manually. That's bad when you are working on a cluster of many hopefully identical machines.

By now, it should be obvious that the details of this process are specific to my favorite distribution of Linux and that I install this as a sysadmin. Things are different if you are a developer, running BSD, or running Windows.

Also, the infrastructure for doing all these things at install/deinstall time already exists in the packaging infrastructure of most Linux distributions. I think it would be a bad idea to duplicate this infrastructure and all its os-specific variations in a pure python packaging infrastructure.

At the moment, distutils and setuptools are the main interfaces between the packaging infrastructure and python applications. Buried deep in most packages is the line:

pythonX.Y setup.py install --single-version-externally-managed --root=./

So, basically, I think that keeping sysadmins happy means maintaining compatibility/extending a distutils style installation.
--
Brian Sutherland

From jim at zope.com Mon Mar 12 15:01:20 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 12 Mar 2007 10:01:20 -0400
Subject: [Web-SIG] windows, pywebd, webctl
In-Reply-To: <20070312052640.GB5066@minipas.home>
References: <45EF9749.3070002@zetaweb.com> <45F1A602.3000104@zetaweb.com> <20070312052640.GB5066@minipas.home>
Message-ID: <376AD825-1180-4595-BFB5-49D5E78C34D6@zope.com>

On Mar 12, 2007, at 1:26 AM, Brian Sutherland wrote:
> On Fri, Mar 09, 2007 at 01:22:58PM -0500, Chad Whitacre wrote:
>> Jim,
>>
>> First, your comments re: paying attention to sysadmins are
>> well-taken. Thanks.
>
> I was pointed to this conversation and would like to comment
> wearing my
> sysadmin hat about what I would like. How I think web applications
> should be installed on unix. Basically, I'll just go through what
> happens when I install apache, squid or postgres on linux.
>
> When I install an application that is a daemon,

There is an interesting subtlety here. I think of Zope (or applications built using Zope components) as applications that can be run as one or more daemons. To me, a daemon is a particular instance of an application, not the application itself. I (and my SAs) prefer to separate software installation from configuration. We prefer that these be 2 steps. We often run multiple daemons of the same application on a single machine. The configuration of these daemons (and cron jobs, and so on) is controlled from a central configuration database that is mostly independent of the software install. We don't want daemons installed automatically when an application is installed.

Jim

--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jinty at web.de Mon Mar 12 15:35:45 2007 From: jinty at web.de (Brian Sutherland) Date: Mon, 12 Mar 2007 15:35:45 +0100 Subject: [Web-SIG] windows, pywebd, webctl In-Reply-To: <376AD825-1180-4595-BFB5-49D5E78C34D6@zope.com> References: <45EF9749.3070002@zetaweb.com> <45F1A602.3000104@zetaweb.com> <20070312052640.GB5066@minipas.home> <376AD825-1180-4595-BFB5-49D5E78C34D6@zope.com> Message-ID: <20070312143545.GB4923@minipas.home> On Mon, Mar 12, 2007 at 10:01:20AM -0400, Jim Fulton wrote: > > On Mar 12, 2007, at 1:26 AM, Brian Sutherland wrote: > > > On Fri, Mar 09, 2007 at 01:22:58PM -0500, Chad Whitacre wrote: > >> Jim, > >> > >> First, your comments re: paying attention to sysadmins are > >> well-taken. Thanks. > > > > I was pointed to this conversation and would like to comment > > wearing my > > sysadmin hat about what I would like. How I think web applications > > should be installed on unix. Basically, I'll just go through what > > happens when I install apache, squid or postgres on linux. > > > > When I install an application that is a daemon, > > There is an interesting subtlety here. I think of Zope (or > applications built using Zope components) as applications that can be > run as one or more daemons. To me, a daemon is a particular instance > of an application, not the application itself. I (and my SAs) prefer > to separate software installation from configuration. We prefer that > these be 2 steps. We often run multiple daemons of the same > application on a single machine. The configuration of these daemons > (and cron jobs, and so on) are controlled from a central > configuration database that is mostly independent of the software > install. We don't want deamons installed automatically when an > application is installed. 
Then perhaps you are more interested in a structure like the one postgresql uses, where there is a namespace in /etc and /var/lib for the specific instance of postgres. All, however, are run as the same system user. Also, I'll note that a well designed packaging system should _never_ blindly overwrite already existing files in /etc, so I would implement your case as: * Install predefined configuration files in /etc * Install daemon package -- Brian Sutherland From chris at simplistix.co.uk Tue Mar 13 14:58:13 2007 From: chris at simplistix.co.uk (Chris Withers) Date: Tue, 13 Mar 2007 13:58:13 +0000 Subject: [Web-SIG] [Proposal] "website" and first-level conf In-Reply-To: <444CE28D-7E2E-478F-9B90-9DD76109FF24@zope.com> References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com> <45EDC772.3090803@simplistix.co.uk> <321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com> <45EFD930.1040406@simplistix.co.uk> <444CE28D-7E2E-478F-9B90-9DD76109FF24@zope.com> Message-ID: <45F6ADF5.2090905@simplistix.co.uk> Jim Fulton wrote: > >> I do a lot of unix deployment, and the thought of a buildout that >> sprays files all over the system, even if they are in standard unix-y >> location scares me a lot... > > That's because you are a developer. OK, I see what you mean now, although I think it's clear that whatever choices we make, they should (easily) allow both models... >>> This is what I do in my latest Zope 3 buildout recipes. >> >> Are those recipes available anywhere? 
> > http://www.python.org/pypi/zc.zope3recipes Great, thanks :-) Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From chris at simplistix.co.uk Tue Mar 13 15:08:18 2007 From: chris at simplistix.co.uk (Chris Withers) Date: Tue, 13 Mar 2007 14:08:18 +0000 Subject: [Web-SIG] ConfigParser for configuration In-Reply-To: <9cee7ab80703090651o2d53c518pff4d52589bd97990@mail.gmail.com> References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> <45EFDDA5.4010205@simplistix.co.uk> <9cee7ab80703080430yc62a8bch3f3e08462d7f5792@mail.gmail.com> <45F130AD.1000904@simplistix.co.uk> <9cee7ab80703090651o2d53c518pff4d52589bd97990@mail.gmail.com> Message-ID: <45F6B052.9040903@simplistix.co.uk> Fred Drake wrote: > On 3/9/07, Chris Withers wrote: >> You have free reign now, right? ;-) > > Heh. Compatibility is worth something, even to me. Oh just BBB it ;-) > The application itself has to understand that it's creating an > arbitrarily nested structure from a simple (two-level) hierarchy. How > that happens is part of the application, not a magical helper library. Funny, I always appreciated the help from the not-so-magical library. Saves a lot of wheel re-inventing when doing config for various projects... > If you really want to use logging.config.fileConfig(), I'd suggest > your app having something like "logging-configuration = > /path/to/logging/config.ini", and using that to call the logging > configuration with the indicated file. OK. >> Can one config.ini include other .ini files in the same way ZConfig >> allows? > > No. :-/ > What this needs to be depends on the application. There's a simple > layering included in ConfigParser (call read() with multiple > filenames, or readfp() more than once), but that doesn't serve > zc.buildout well. 
You can look in the zc.buildout documentation and > code for what that does; look for "extends". Ah, I think I'm getting the picture now. So, basically, everything ends up in one dictionary, and you need to be careful nothing re-uses a key? cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From chris at simplistix.co.uk Tue Mar 13 15:15:18 2007 From: chris at simplistix.co.uk (Chris Withers) Date: Tue, 13 Mar 2007 14:15:18 +0000 Subject: [Web-SIG] more comments on Paste Deploy In-Reply-To: <5BA15D06-6EF0-4315-9721-9B793D1E03B8@zope.com> References: <45E99DC1.4010703@zetaweb.com> <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> <45EFDDA5.4010205@simplistix.co.uk> <5BA15D06-6EF0-4315-9721-9B793D1E03B8@zope.com> Message-ID: <45F6B1F6.9080109@simplistix.co.uk> Jim Fulton wrote: > >> It all gives indication. Yes, if only one person says "this sucks", >> then their opinion may not be worth changing the implementation for. >> However, if 50% of users said "this sucks", even if they couldn't >> explain why, that'd be something worth worrying about. > > Sure, but how do you fix anything if they don't say why it sucks? How > do you make it better? How do you even know if they are trying to solve > the same problem that you are? Or if they've actually tried the tool > your talking about. These are all good points and they're the tough ones to answer. I've often found people are justified in their opinions even if they can't find a way to communicate the reasons for those opinions... >>> The ini/config format is pretty standard and, IMO, really quite >>> adequate. >> >> How does it handle nesting? > > Using cross-section references. 
So, rather than having an embedded > section, you have an option that refers to another section (or > collection of sections). I finally get this now :-) I do still worry about trying to figure out who's using what key (in terms of config files with sections for more than one type of configuration in them, as ZConfig provides). Am I right in thinking the way to avoid this in ConfigParser is to have one file that references lots of other files? eg: [config] logging=logging.ini zodb=zodb.ini ...etc.. cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From fdrake at gmail.com Tue Mar 13 15:38:26 2007 From: fdrake at gmail.com (Fred Drake) Date: Tue, 13 Mar 2007 09:38:26 -0500 Subject: [Web-SIG] ConfigParser for configuration In-Reply-To: <45F6B052.9040903@simplistix.co.uk> References: <45E8EB97.6090805@zetaweb.com> <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com> <45E99DC1.4010703@zetaweb.com> <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com> <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com> <45EFDDA5.4010205@simplistix.co.uk> <9cee7ab80703080430yc62a8bch3f3e08462d7f5792@mail.gmail.com> <45F130AD.1000904@simplistix.co.uk> <9cee7ab80703090651o2d53c518pff4d52589bd97990@mail.gmail.com> <45F6B052.9040903@simplistix.co.uk> Message-ID: <9cee7ab80703130738l64c805b8ubf76459b9f32c821@mail.gmail.com> On 3/13/07, Chris Withers wrote: > So, basically, everything ends up in one dictionary, and you need to be > careful nothing re-uses a key? The result is (essentially) a dictionary of dictionaries, so no, there's no worry about overlapping keys across sections. -Fred -- Fred L. Drake, Jr. "Every sin is the result of a collaboration." 
--Lucius Annaeus Seneca From ianb at colorstudy.com Tue Mar 13 20:47:54 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 13 Mar 2007 14:47:54 -0500 Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking Middleware In-Reply-To: <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com> References: <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com> <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com> <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com> Message-ID: <45F6FFEA.9080007@colorstudy.com> Phillip J. Eby wrote: >> basically, where each object type results in a new key in the >> environment and a new ad hoc specification to be made (e.g., >> wsgi.file_wrapper takes a block size, which is specific only to that >> case). > > Right. I'm specifically saying that a collection of individual > specifications is much *better* than a single overarching specification > generalized from a single example. Single use cases make bad general > specs. > > >> OK, the dict would avoid multiple different kinds of keys, and >> presumably they'd all have the same signature. Block size doesn't >> really make any sense to me as a common parameter. Content type >> should be a common parameter, as something like an lxml object can be >> serialized as either XML or HTML. I don't think any response headers >> are likely to effect the serialization... though with my specification >> that remains an application concern, so it doesn't have to be resolved >> in the specification. > > Please don't keep trying to generalize this. They're called > "specific-ations", not "general-izations". :) > > >>> Notice that this approach doesn't require any special protocol for >>> these wrappers -- just WSGI. It's simpler to specify, and simpler to >>> implement than what you propose, while addressing some of the open >>> issues. >> >> The specification isn't particularly long or complicated, IMHO. 
>
> That's because it doesn't address any of the real issues -- they're all
> deferred to your "open issues" section. That's why I don't think having
> the specification adds any value over highlighting the existing WSGI
> pattern for extending the response (i.e. server-supplied
> iterator-wrappers).

The open issues section has three issues. One is a matter of defining some naming convention, and as long as people *try* to match up their conventions it will work. The second has a proposed solution. The last is merely aesthetic.

These are the "real issues" you are referring to?

>> When playing with implementation I used type names, and actually I
>> rather prefer them, but it's not always clear what name to use. For
>> instance, "lxml", "lxml.etree", "lxml.etree.Element", and
>> "lxml.etree._Element" all are reasonable names. Or "ElementTree",
>> "ElementTree.Element", "ElementTree._Element", "xml.etree",
>> "xml.etree.Element", and "xml.etree._Element". Or even something like
>> "IElement" could make sense in some context (e.g., what if you can
>> accept the overlapping interfaces of both lxml and ElementTree?)
>>
>> At least the actual type object seems easy enough. OTOH, there are
>> actually cases when I'd like to say that I could accept a certain type
>> without having to import the type. E.g., if I wanted to do an XSLT
>> transformation, I *could* support several kinds of objects without
>> requiring any of them (e.g., lxml, 4DOM, and Genshi Markup).
>
> These problems all stem from premature generalization. It's a trivial
> problem to fix, however, if you are trying to share one particular
> content type: just pick a key and use it!

That's not much easier, really. It would still need to be documented, implemented, and defined properly. The biggest difference is that it needs to be done again for each type of object.
> Libraries such as wsgiref can support this pattern by providing a > utility like "wrap_content(environ, content, default_wrapper, *keys)" > function that looks up "keys" to find a wrapper to use in place of the > default_wrapper. > > >>>> The same things apply to the parsing of ``wsgi.input``, specifically >>>> parsing form data. A similar strategy is presented to avoid >>>> unnecessarily reparsing that data. >>> I would rather offer an optional 'get_file_storage()' method or some >>> such as a blessed WSGI extension, than have such an open-ended "get >>> whatever you want from the input object" concept floating around. A >>> strategy which reinvents half of PEP 246 (the *old* PEP 246, before >>> it became almost as complicated as WSGI) seems like overkill to me. >> >> I don't really understand what you are proposing. > > That wsgi.input be allowed to have a 'get_file_storage()' method that > can be called by applications, and that calling it means the input > stream must not have been read and will no longer be readable. > > >> This part addresses the same issues as presented in >> http://wsgi.org/wsgi/Specifications/handling_post_forms >> >> I really don't *want* to write every wsgi.input to a temporary file >> just because someone else *might* want to reparse the input. I'd much >> rather do it lazily, as 99% of the time reparsing won't happen. > > I don't understand your complaint, as it seems unrelated to what I propose. I didn't understand what you were proposing, I think. I still don't really know what get_file_storage means. >>>> Other Possibilities >>>> ------------------- >>>> >>>> * You could simply parse everything ever time. >>>> * You could pass data through callbacks in the environment (but this >>>> can >>>> break non-aware middleware). >>>> * You can make custom methods and keys for each case. >>>> * You can use something other than WSGI. 
>>> And you can use the established WSGI method for adding semantics to a >>> response, using a middleware-supplied wrapper. I think this is >>> actually the best alternative. >> >> I really don't understand the advantage. > > It's simple: *specifications are a liability in the general case*. They > are supposed to be the record of negotiations between people who need to > co-operate, not an attempt to solve all possible problems. This certainly doesn't solve all possible problems, it only addresses one particular issue. > So, if your spec is only about how relatively tight-coupled WFC's (WSGI > framework components) talk to each other, it seems more properly the > business of a web framework, not WSGI. Most of the places I want to use this are *not* at the framework level. A simple example is just parsing form data without having to own the data, which is an outstanding issue with WSGI stacks, and can be done outside of a framework. Another is how to communicate non-string data while having graceful fallback for string data. This is of particular interest to me, as I turn WSGI into HTTP quite often, and there's definitely nothing but strings at that point. > However, it *is* WSGI (wsgi-onic?) for the authors of certain components > to get together and say, "hey let's agree on this wrapper protocol"... > or better yet, a wrapper *implementation*. > > This is way way better than having another spec. Every godforsaken new > spec attached to WSGI just makes the whole thing seem way too > complicated. In retrospect, I wish I hadn't supported some of the > options and doodads and whatnots that are in WSGI today. If I had it to > do over, WSGI would be a lot simpler. This is a wsgiorg. specification, not a wsgi., and it's not meant to solve all issues. It is meant to be implementation neutral. > However, it's not too late to stop adding new cruft -- and I consider > the idea of reinventing PEP 246 inside of WSGI to be cruft of a most > horrible kind. 
> > >> Best practice is fine, though of course still needs to be documented, >> as this is hardly a practice that people would naturally think about >> or implement. > > Well, it's in PEP 333. It's a nice idea, but as far as I know no one has actually used wsgi.file_wrapper. Though so far no one has paid very close attention to these kinds of performance issues either. I think using it in a useful way requires platform-specific twiddling that no one cares to do. >> But I don't really think that practice would be any simpler or >> easier to describe if done completely. In fact, I think it would take >> exactly the same amount of space to describe. > > Even if it *did*, it'd still be better. However, since it's not a spec, > it can be presented informally. Here's an example: > > "If you want to give applications underneath your middleware a chance to > return rich responses (i.e., objects instead of strings), follow the > pattern used for the WSGI 'file wrapper' object. That is, have your > server or middleware add an environ key with a wrapper API that can > convert the richer objects you're expecting into a standard WSGI > iterator. Then, your server can simply inspect the iterators it > receives to see if they are instances of your wrapper type, and pull out > the objects you want. In this way, if there is middleware between you > and the application returning the rich response that modifies the > response body, you will receive an iterator of a different type, which > you can process in the usual way. However, if you receive an instance > of your wrapper type, you will know that you can access the rich data > directly." > > Now, can you expand this into more of a tutorial, give more hints and so > on? Absolutely. It'd be a great idea to. But the basic idea is simple > and doesn't require rigorous definitions -- it just needs people to > publish what keys they're using and the *specifications thereof*. 
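The pattern described in the quoted paragraph above can be sketched roughly as follows. This is only an illustration under stated assumptions: the environ key "example.rich_wrapper", the RichBody class, and the rich_middleware and serialize helpers are hypothetical names made up for this sketch, not part of WSGI, wsgiref, or any proposal in this thread.

```python
# Hypothetical names throughout; nothing here is defined by WSGI itself.
RICH_KEY = "example.rich_wrapper"


class RichBody:
    """A normal WSGI iterable that also carries the original object."""

    def __init__(self, obj, serialize):
        self.obj = obj
        self._serialize = serialize

    def __iter__(self):
        # Fallback path: anything that just iterates gets plain bytes.
        yield self._serialize(self.obj)


def rich_middleware(app, on_rich):
    """Offer the wrapper to `app`; pass rich objects to `on_rich`."""

    def wrapped(environ, start_response):
        environ[RICH_KEY] = RichBody
        body = app(environ, start_response)
        if isinstance(body, RichBody):
            # Nothing in between replaced the body, so use the object.
            return [on_rich(body.obj)]
        # A different iterator came back; treat it as ordinary output.
        return body

    return wrapped


def serialize(data):
    """Turn the rich object into bytes for the plain-WSGI fallback."""
    return repr(data).encode("utf-8")


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    data = {"greeting": "hello"}
    wrapper = environ.get(RICH_KEY)
    if wrapper is None:
        # No aware middleware above us: return ordinary bytes.
        return [serialize(data)]
    return wrapper(data, serialize)
```

If some other middleware sits between the two and rewrites the body, the isinstance check fails and the response is handled like any other WSGI iterable, which is the graceful fallback the paragraph describes.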
> > What you're trying to specify is effectively a *meta*-specification: > much more difficult to do well, and not nearly as useful to have in this > case. Except insofar as "type" is variable in my specification, I don't think it is substantially different. If no one cares about this, then I guess I can just put it under the httpencode namespace where it was before, but I don't see any reason to make it less general. -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From pje at telecommunity.com Tue Mar 13 21:12:43 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 13 Mar 2007 15:12:43 -0500 Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking Middleware In-Reply-To: <45F6FFEA.9080007@colorstudy.com> References: <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com> <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com> <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com> <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com> Message-ID: <5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com> At 02:47 PM 3/13/2007 -0500, Ian Bicking wrote: >The open issues section has three issue. One is a matter of defining some >naming convention, and as long as people *try* to match up their >conventions it will work. The second has a proposed solution. The last >is merely aesthetic. > >These are the "real issues" you are referring to? No - I'm saying that the real issues are all (and always) specific to the particular data type being exchanged. >That's not much easier, really. It would still be documented, still needs >to be implemented and defined properly. The biggest difference is that it >needs to be done again for each type of object. It has to be anyway. >I didn't understand what you were proposing, I think. I still don't >really know what get_file_storage means. It would return a cgi.file_storage encoding the request body. 
>It's a nice idea, but as far as I know no one has actually used
>wsgi.file_wrapper.

I believe that the Jython WSGI implementation provides one, or something analogous that wraps certain types of Java stream objects.

>Except insofar as "type" is variable in my specification, I don't think it
>is substantially different.

That is indeed the substance of the difference - yours is a meta-specification, rather than a specification. As a result, it's more complicated to grasp than a pattern... and significantly more difficult to get *right*. And without examples, it's basically impossible to get right.

>If no one cares about this, then I guess I can just put it under the
>httpencode namespace where it was before, but I don't see any reason to
>make it less general.

It'll be worth making it general when there are more examples of the pattern to generalize from. As you pointed out yourself, there are very few at the moment.

From ianb at colorstudy.com Tue Mar 13 21:14:37 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 13 Mar 2007 15:14:37 -0500
Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking Middleware
In-Reply-To: <5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com> <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com> <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com> <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com> <5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com>
Message-ID: <45F7062D.6060209@colorstudy.com>

Phillip J. Eby wrote:
>> I didn't understand what you were proposing, I think. I still don't
>> really know what get_file_storage means.
>
> It would return a cgi.file_storage encoding the request body.

I still don't understand. Are you talking about cgi.FieldStorage? Are you talking about an implementation of something, or something in the environment?
--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From pje at telecommunity.com Tue Mar 13 21:34:40 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 13 Mar 2007 15:34:40 -0500
Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking Middleware
In-Reply-To: <45F7062D.6060209@colorstudy.com>
References: <5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com> <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com> <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com> <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com> <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com> <5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070313153400.027f5ab8@sparrow.telecommunity.com>

At 03:14 PM 3/13/2007 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>>I didn't understand what you were proposing, I think. I still don't
>>>really know what get_file_storage means.
>>It would return a cgi.file_storage encoding the request body.
>
>I still don't understand. Are you talking about cgi.FieldStorage?

Oops. Yeah. That should be get_field_storage(), then. D'oh. Sorry about that. Obviously it's been a while since I've used one of those directly.
:) From ianb at colorstudy.com Tue Mar 13 22:15:46 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 13 Mar 2007 16:15:46 -0500 Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking Middleware In-Reply-To: <5.1.1.6.0.20070313153400.027f5ab8@sparrow.telecommunity.com> References: <5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com> <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com> <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com> <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com> <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com> <5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com> <5.1.1.6.0.20070313153400.027f5ab8@sparrow.telecommunity.com> Message-ID: <45F71482.3050204@colorstudy.com> Phillip J. Eby wrote: > At 03:14 PM 3/13/2007 -0500, Ian Bicking wrote: >> Phillip J. Eby wrote: >>>> I didn't understand what you were proposing, I think. I still don't >>>> really know what get_file_storage means. >>> It would return a cgi.file_storage encoding the request body. >> >> I still don't understand. Are you talking about cgi.FieldStorage? > > Oops. Yeah. That should be get_field_storage(), then. D'oh. Sorry > about that. Obviously it's been a while since I've used one of thos > directly. :) OK, we're getting closer, but I'm *still* not entirely sure what you are proposing. Are you talking about adding a function to wsgiref that either parses the input with cgi.FieldStorage, or gets an existing parsed value? -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From pje at telecommunity.com Tue Mar 13 22:38:44 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue, 13 Mar 2007 16:38:44 -0500 Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking Middleware In-Reply-To: <45F71482.3050204@colorstudy.com> References: <5.1.1.6.0.20070313153400.027f5ab8@sparrow.telecommunity.com> <5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com> <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com> <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com> <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com> <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com> <5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com> <5.1.1.6.0.20070313153400.027f5ab8@sparrow.telecommunity.com> Message-ID: <5.1.1.6.0.20070313163754.027c8e28@sparrow.telecommunity.com> At 04:15 PM 3/13/2007 -0500, Ian Bicking wrote: >Phillip J. Eby wrote: >>At 03:14 PM 3/13/2007 -0500, Ian Bicking wrote: >>>Phillip J. Eby wrote: >>>>>I didn't understand what you were proposing, I think. I still don't >>>>>really know what get_file_storage means. >>>>It would return a cgi.file_storage encoding the request body. >>> >>>I still don't understand. Are you talking about cgi.FieldStorage? >>Oops. Yeah. That should be get_field_storage(), then. D'oh. Sorry >>about that. Obviously it's been a while since I've used one of thos >>directly. :) > >OK, we're getting closer, but I'm *still* not entirely sure what you are >proposing. Are you talking about adding a function to wsgiref that either >parses the input with cgi.FieldStorage, or gets an existing parsed value? I was talking about defining a standard WSGI extension whereby the wsgi.input object could have such a method. 
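A rough sketch of what such an extension on the wsgi.input object might look like, under stated assumptions: the FormInput class is a hypothetical name, and urllib.parse.parse_qs is swapped in as a stand-in for cgi.FieldStorage (the cgi module is no longer in the standard library of recent Python versions), so only urlencoded bodies are handled. The rule that calling the method makes the stream unreadable afterwards, and that it requires an unread stream, follows the description earlier in this thread.

```python
import io
from urllib.parse import parse_qs


class FormInput:
    """Hypothetical wsgi.input wrapper offering get_field_storage()."""

    def __init__(self, stream, environ):
        self._stream = stream
        self._environ = environ
        self._touched = False  # set once read() has been called
        self._fields = None    # cached parse result

    def read(self, size=-1):
        if self._fields is not None:
            raise IOError("input already consumed by get_field_storage()")
        self._touched = True
        return self._stream.read(size)

    def get_field_storage(self):
        # Parse lazily, once; later callers share the cached result.
        if self._fields is None:
            if self._touched:
                raise IOError("input stream has already been read")
            length = int(self._environ.get("CONTENT_LENGTH") or 0)
            raw = self._stream.read(length).decode("ascii")
            # Stand-in for cgi.FieldStorage: a dict of value lists.
            self._fields = parse_qs(raw, keep_blank_values=True)
        return self._fields


# Usage: a server or middleware would install the wrapper in the environ.
body = b"name=web-sig&tag=wsgi&tag=forms"
environ = {"CONTENT_LENGTH": str(len(body))}
environ["wsgi.input"] = FormInput(io.BytesIO(body), environ)
fields = environ["wsgi.input"].get_field_storage()
```

Because the parsed result is cached on the input object, a second component calling get_field_storage() gets the same fields without reparsing, and nothing has to be written to a temporary file just in case.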

From rodsenra at gpr.com.br Fri Mar 16 15:46:56 2007
From: rodsenra at gpr.com.br (Rodrigo Senra)
Date: Fri, 16 Mar 2007 11:46:56 -0300
Subject: [Web-SIG] [Proposal] "website" and first-level conf
In-Reply-To: <45F1315F.4000000@simplistix.co.uk>
References: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
 <45EDC772.3090803@simplistix.co.uk>
 <321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com>
 <45EFD930.1040406@simplistix.co.uk>
 <20070308110244.56b81bd5@Fenix>
 <45F1315F.4000000@simplistix.co.uk>
Message-ID: <20070316114656.281c02e0@Fenix>

|Rodrigo Senra :
|> - multiple Zope instances sharing libraries, python modules,
|> and Zope/Plone Products. These files might be placed out of
|> the instance tree.

[ Chris Withers ]:
|Sometimes you want this, sometimes you don't ;-)

Indeed.

|Rodrigo Senra :
|> - one optimization (we actually do) is to create different disk
|> partitions. One optimized for *large* files (like logs and
|> databases) and other for small files (like source code, libraries
|> and config files).

[ Chris Withers ]:
|I've never seen the need myself, what measurable differences has this
|made?

I do not have quantitative results since I have done that separation
from the start. But, since there are file systems optimized for a few
large files and others for many small ones, it makes sense to trust
FS people and make use of that ;o)

Nevertheless, I see your point that without measurements it "might"
not be worth the trouble. On the other hand, if you plan your
partitions prior to any software installation the overhead is minimal
and any (unmeasured) performance benefit is for free <0.5wink>.

|Rodrigo Senra :
|> In spite of that, I would love to keep deploys *totally*
|> self-contained. Nevertheless, I was not wise enough to work around
|> some of the use cases presented above ;o)

[ Chris Withers ]:
|Sounds like we really need to support both...

+1

Abraço,
Senra

-------------
Rodrigo Senra
GPr Sistemas http://www.gpr.com.br

From graham.dumpleton at gmail.com Wed Mar 21 11:36:07 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Wed, 21 Mar 2007 21:36:07 +1100
Subject: [Web-SIG] Direct use of sys.stdout, sys.stderr and sys.stdin in WSGI application.
Message-ID: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>

When one is using CGI as a means of implementing a WSGI application,
although one would return content through the iterable returned from
the application or by calling write() method returned from
start_response(), one could actually write to sys.stdout directly as
well since that is where the WSGI adapter writes it to anyway.

Obviously this isn't something that should be done but then the WSGI
PEP doesn't say anything about code not writing to sys.stdout and more
than likely at some point someone is going to think they can just use
'print' to have some debugging statements output where they think they
will see them. In the case of CGI such output would wrongly end up in
the response and screw things up.

To clarify this, a future update to WSGI specification or this
environment specification people have been talking about, should
perhaps clarify what behaviour one can expect out of sys.stdin,
sys.stdout and sys.stderr.

In the case of sys.stdout, do people see it as being at least good
practice, if not required by specification, that the WSGI adapter
should ensure that sys.stdout cannot be written to directly or by
using 'print' from a WSGI application. Thus, in a CGI adapter it would
do something like:

  import sys

  class dummystdout:
    def write(self, *args):
      raise IOError("WSGI prohibits use of sys.stdout.")
    ....

  def run_with_cgi(application):
    ...

    stdout = sys.stdout
    sys.stdout = dummystdout()

    ...

    def write(data):
      ...
      stdout.write(data)
      stdout.flush()

In other words, it saves a reference to sys.stdout for its own use and
then replaces sys.stdout with a dummy file like object that raises an
exception if written to in any way or flushed.

Even in Apache where sys.stdout (if flushed) eventually makes its way
to the Apache error log, it seems it would also be a good idea to
disable sys.stdout. The idea here is that if all WSGI adapters ensured
that sys.stdout wasn't usable you would reduce the possibility of
someone's code inadvertently using it with one server and have it
seemingly work and then move to CGI and find it screws everything up.
Thus we are sort of protecting people by locking down the environment
a bit so application portability issues are more easily found.

With sys.stdin, you have a similar issue with CGI whereby you don't
want a WSGI application reading from it directly. Thus sys.stdin
should probably also be replaced with a file like object that always
returns EOF (empty string). Having sys.stdin do anything meaningful in
a multiple process server system like Apache also doesn't make sense,
although in the case of Apache it already ensures that stdin returns
EOF.

The tricky one is single process servers (which don't use sys.stdin
like CGI), as people may want to use interactive debuggers such as
pdb, although where a single process is actually multithreaded it
could preclude that to a degree unless you can stop two interactive
debugger sessions being triggered at the same time. In Apache even if
one configures it to use only one child process this will still not
work. To get Apache to allow you to use pdb you have to run up httpd
direct with -DONE_PROCESS option.

Anyway, it may seem good practice for a WSGI adapter to still prevent
use of sys.stdin unless configured explicitly to allow it and even
then it might only allow it if the server is running in a mode whereby
it would work.

Finally, sys.stderr also presents problems of its own.
Although wsgi.errors is provided with the request environment, this
can't be used at global scope within a module when importing and also
shouldn't be used beyond the life time of the specific request. Thus,
there isn't a way to log stuff outside of a request and ensure it gets
to the server log. One could try and mandate use of 'logging' module,
but this isn't available in old versions of Python. Thus probably
easier to say that a WSGI adapter should always ensure that sys.stderr
is redirected to the server log. Only problem with this idea is that
you can potentially get interleaving of text when multithreading is
being used. What you need is for sys.stderr to be underlayed with
thread specific log objects each with its own buffering mechanism that
ensures that only complete lines of text get sent to the actual log
file. For log object associated with threads created to service a
request, easy enough to flush and cleanup such log object at the end
of the request, but what to do about user created threads as harder to
know when thread has finished and cleanup as necessary.

Yes one could simply ignore the whole issue, but I feel that a good
quality WSGI adapter/server should address these issues and either
lock things down as appropriate to protect users from themselves or
ensure that using them results in a sensible outcome.

Anyone who appreciates what I am talking about here got any opinions
of their own about these issues?

Graham

From pywebsig at alan.kennedy.name Thu Mar 22 12:29:27 2007
From: pywebsig at alan.kennedy.name (Alan Kennedy)
Date: Thu, 22 Mar 2007 11:29:27 +0000
Subject: [Web-SIG] Direct use of sys.stdout, sys.stderr and sys.stdin in WSGI application.
In-Reply-To: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
References: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
Message-ID: <4a951aa00703220429t7abaca96i643f7ac2284fbc9e@mail.gmail.com>

Graham,

I thought I'd reply, so that we'd get replies from everyone else to
tell me I'm wrong.

All your points are good common-sense stuff. I think that all of your
policies on stdin, stdout, and stderr are good, and are appropriate for
a WSGI environment running inside an Apache server.

Some small points.

> ..... one could actually write to sys.stdout directly as
> well since that is where the WSGI adapter writes it to anyway.

I think it's a good idea to redirect stdout, and to document in your
server/gateway documentation that you are doing so. I also think this
is a server specific issue.

> Anyway, it may seem good practice for a WSGI adapter to still prevent
> use of sys.stdin unless configured explicitly to allow it and even
> then it might only allow it if the server is running in a mode whereby
> it would work.

This should be a server-specific feature, that is documented.

> Finally, sys.stderr also presents problems of its own. Although
> wsgi.errors is provided with the request environment, this can't be
> used at global scope within a module when importing and also shouldn't
> be used beyond the life time of the specific request. Thus, there
> isn't a way to log stuff outside of a request and ensure it gets to
> the server log. One could try and mandate use of 'logging' module, but
> this isn't available in old versions of Python.

I don't think you need to worry about versions of python that don't
have the logging module. Strictly speaking, WSGI requires python 2.2,
because of iterators. So I think it's extremely unlikely that people
will be running WSGI apps on pre-2.2 VMs.

> What you need is for sys.stderr to be underlayed with thread
> specific log objects each with its own buffering mechanism that
> ensures that only complete lines of text get sent to the actual log
> file.

This is a server/gateway implementation detail.

> Yes one could simply ignore the whole issue, but I feel that a good
> quality WSGI adapter/server should address these issues and either
> lock things down as appropriate to protect users from themselves or
> ensure that using them results in a sensible outcome.

Given how much talk there is of the WSGI "environment", I think it's
good to raise these issues.

Regards,

Alan.

From pje at telecommunity.com Thu Mar 22 16:30:00 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 22 Mar 2007 10:30:00 -0500
Subject: [Web-SIG] Direct use of sys.stdout, sys.stderr and sys.stdin in WSGI application.
In-Reply-To: <4a951aa00703220429t7abaca96i643f7ac2284fbc9e@mail.gmail.com>
References: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
 <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
Message-ID: <5.1.1.6.0.20070322102826.02d12230@sparrow.telecommunity.com>

At 11:29 AM 3/22/2007 +0000, Alan Kennedy wrote:
>Strictly speaking, WSGI requires python 2.2,
>because of iterators.

Actually, it doesn't. The pre-2.2 iterator protocol is to be used in such
cases:

http://www.python.org/dev/peps/pep-0333/#supporting-older-2-2-versions-of-python

From ianb at colorstudy.com Thu Mar 22 17:03:50 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 22 Mar 2007 11:03:50 -0500
Subject: [Web-SIG] Direct use of sys.stdout, sys.stderr and sys.stdin in WSGI application.
In-Reply-To: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
References: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
Message-ID: <4602A8E6.6080805@colorstudy.com>

Graham Dumpleton wrote:
> When one is using CGI as a means of implementing a WSGI application,
> although one would return content through the iterable returned from
> the application or by calling write() method returned from
> start_response(), one could actually write to sys.stdout directly as
> well since that is where the WSGI adapter writes it to anyway.
>
> Obviously this isn't something that should be done but then the WSGI
> PEP doesn't say anything about code not writing to sys.stdout and more
> than likely at some point someone is going to think they can just use
> 'print' to have some debugging statements output where they think they
> will see them. In the case of CGI such output would wrongly end up in
> the response and screw things up.

Apparently I didn't ever fix up sys.stdout in my cgi-related code (I
don't know if anyone actually uses it either), but I always intended to
do so. Particularly because the resulting bugs will be totally weird and
hard to understand if people do print stuff. I personally would capture
stdout and put everything on stderr.

> To clarify this, a future update to WSGI specification or this
> environment specification people have been talking about, should
> perhaps clarify what behaviour one can expect out of sys.stdin,
> sys.stdout and sys.stderr.
>
> In the case of sys.stdout, do people see it as being at least good
> practice, if not required by specification, that the WSGI adapter
> should ensure that sys.stdout cannot be written to directly or by
> using 'print' from a WSGI application. Thus, in a CGI adapter it would
> do something like:
>
>   import sys
>
>   class dummystdout:
>     def write(self, *args):
>       raise IOError("WSGI prohibits use of sys.stdout.")
>     ....
>
>   def run_with_cgi(application):
>     ...
>
>     stdout = sys.stdout
>     sys.stdout = dummystdout()
>
>     ...
>
>     def write(data):
>       ...
>       stdout.write(data)
>       stdout.flush()
>
> In other words, it saves a reference to sys.stdout for its own use and
> then replaces sys.stdout with a dummy file like object that raises an
> exception if written to in any way or flushed.

As an avid user of "print" for debugging, this would bug me. I would
prefer just avoiding the CGI case where stdout goes to the client, and
otherwise saying that the server should try to put stdout output
someplace where it can be read. But it could very well be a console,
not necessarily a log file. Or the same log file as stderr, or...
something.

> With sys.stdin, you have a similar issue with CGI whereby you don't
> want a WSGI application reading from it directly. Thus sys.stdin
> should probably also be replaced with a file like object that always
> returns EOF (empty string). Having sys.stdin do anything meaningful in
> a multiple process server system like Apache also doesn't make sense,
> although in the case of Apache it already ensures that stdin returns
> EOF.

Yes, I don't see any real utility to sys.stdin, except potential
confusion.

> The tricky one is single process servers (which don't use sys.stdin
> like CGI), as people may want to use interactive debuggers such as
> pdb, although where a single process is actually multithreaded it
> could preclude that to a degree unless you can stop two interactive
> debugger sessions being triggered at the same time. In Apache even if
> one configures it to use only one child process this will still not
> work. To get Apache to allow you to use pdb you have to run up httpd
> direct with -DONE_PROCESS option.

Well... that's all true. So I think this can be left up to the server.
Any CGI server should protect the user from unintentionally bypassing
the server. Otherwise using sys.stdin probably implies some intention
that we don't really need to get in the way of.
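
A minimal sketch of the kind of protection a CGI gateway could apply,
combining the two ideas above: stray output from 'print' is sent to
sys.stderr rather than into the response, and sys.stdin reads as EOF. The
helper name guarded_stdio is hypothetical, and the sketch works in text
mode for brevity; a real gateway would presumably use the underlying
binary streams.

```python
import contextlib
import io
import sys


@contextlib.contextmanager
def guarded_stdio():
    """Hypothetical helper: reserve the real stdin/stdout for the gateway.

    While active, print() output goes to sys.stderr and sys.stdin reads
    as EOF. Yields (real_stdin, real_stdout) for the gateway's own use.
    """
    real_stdin, real_stdout = sys.stdin, sys.stdout
    sys.stdout = sys.stderr        # stray prints land in the error stream
    sys.stdin = io.StringIO("")    # every read returns "" (EOF)
    try:
        yield real_stdin, real_stdout
    finally:
        sys.stdin, sys.stdout = real_stdin, real_stdout
```

The gateway would run the application inside "with guarded_stdio() as
(request_in, response_out):", reading the request body from request_in
and writing the response to response_out, so application code never sees
either stream.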

> Finally, sys.stderr also presents problems of its own. Although
> wsgi.errors is provided with the request environment, this can't be
> used at global scope within a module when importing and also shouldn't
> be used beyond the life time of the specific request. Thus, there
> isn't a way to log stuff outside of a request and ensure it gets to
> the server log. One could try and mandate use of 'logging' module, but
> this isn't available in old versions of Python. Thus probably easier
> to say that a WSGI adapter should always ensure that sys.stderr is
> redirected to the server log. Only problem with this idea is that you
> can potentially get interleaving of text when multithreading is being
> used. What you need is for sys.stderr to be underlayed with thread
> specific log objects each with its own buffering mechanism that
> ensures that only complete lines of text get sent to the actual log
> file. For log object associated with threads created to service a
> request, easy enough to flush and cleanup such log object at the end
> of the request, but what to do about user created threads as harder to
> know when thread has finished and cleanup as necessary.

I think sys.stderr and sys.stdout are fairly similar. wsgi.errors
*could* be improved over a simple stream (e.g., you could cache stuff
written to it, and write it in one chunk that is all the errors for the
request). But you could also just create some middleware that does
that, writing to the server logs.

> Yes one could simply ignore the whole issue, but I feel that a good
> quality WSGI adapter/server should address these issues and either
> lock things down as appropriate to protect users from themselves or
> ensure that using them results in a sensible outcome.
>
> Anyone who appreciates what I am talking about here got any opinions
> of their own about these issues?

I guess in practice this hasn't been a problem for me. In a CGI context
these things certainly should be resolved because of the overlap.
But very few people use a CGI server, so it doesn't seem to come up
often.

--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
 | Write code, do good | http://topp.openplans.org/careers

From pywebsig at alan.kennedy.name Thu Mar 22 17:52:01 2007
From: pywebsig at alan.kennedy.name (Alan Kennedy)
Date: Thu, 22 Mar 2007 16:52:01 +0000
Subject: [Web-SIG] Direct use of sys.stdout, sys.stderr and sys.stdin in WSGI application.
In-Reply-To: <5.1.1.6.0.20070322102826.02d12230@sparrow.telecommunity.com>
References: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
 <5.1.1.6.0.20070322102826.02d12230@sparrow.telecommunity.com>
Message-ID: <4a951aa00703220952k6215b122vdd247b01c2e651cb@mail.gmail.com>

[Alan Kennedy]
>>Strictly speaking, WSGI requires python 2.2,
>>because of iterators.

[Phillip J. Eby]
> Actually, it doesn't. The pre-2.2 iterator protocol is to be used in such
> cases:
>
> http://www.python.org/dev/peps/pep-0333/#supporting-older-2-2-versions-of-python

Dang! I knew I couldn't say anything on web-sig without being
contradicted ;-)

I am familiar with that section. I'm sure you remember writing this in
the credits section: "Alan Kennedy, whose courageous attempts to
implement WSGI-on-Jython (well before the spec was finalized) helped to
shape the "supporting older versions of Python" section".

But if the users want their "modern" python applications to be
portable everywhere on WSGI, e.g. returning (iterable) files as output,
or generators, then they should really stick with 2.2+.

But you are, of course, right about the pre-2.2 iterator protocol. I
wrote modjy for jython 2.1 according to the PEP guidelines, and have
had user reports that it works without modification on jython 2.2+.

Regards,

Alan.

From pje at telecommunity.com Thu Mar 22 20:45:53 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 22 Mar 2007 14:45:53 -0500
Subject: [Web-SIG] Direct use of sys.stdout, sys.stderr and sys.stdin in WSGI application.
In-Reply-To: <4a951aa00703220952k6215b122vdd247b01c2e651cb@mail.gmail.com>
References: <5.1.1.6.0.20070322102826.02d12230@sparrow.telecommunity.com>
 <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
 <5.1.1.6.0.20070322102826.02d12230@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070322144429.02c5b8b8@sparrow.telecommunity.com>

At 04:52 PM 3/22/2007 +0000, Alan Kennedy wrote:
>But if the users want their "modern" python applications to be
>portable everywhere on WSGI, e.g. returning (iterable) files as output,

Actually, returning a file as output is a horrible idea, since it will
massively reduce throughput, due to transmitting one line at a time to the
web browser. :)

From graham.dumpleton at gmail.com Thu Mar 22 22:03:04 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Fri, 23 Mar 2007 08:03:04 +1100
Subject: [Web-SIG] Direct use of sys.stdout, sys.stderr and sys.stdin in WSGI application.
In-Reply-To: <4602A8E6.6080805@colorstudy.com>
References: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
 <4602A8E6.6080805@colorstudy.com>
Message-ID: <88e286470703221403s4ef424a8q14ac2ffb4ff74d36@mail.gmail.com>

Thanks for all the input, gives me some things to think about.

On 23/03/07, Ian Bicking wrote:
> Graham Dumpleton wrote:
> > In the case of sys.stdout, do people see it as being at least good
> > practice, if not required by specification, that the WSGI adapter
> > should ensure that sys.stdout cannot be written to directly or by
> > using 'print' from a WSGI application. Thus, in a CGI adapter it would
> > do something like:
> >
> >   import sys
> >
> >   class dummystdout:
> >     def write(self, *args):
> >       raise IOError("WSGI prohibits use of sys.stdout.")
> >     ....
> >
> >   def run_with_cgi(application):
> >     ...
> >
> >     stdout = sys.stdout
> >     sys.stdout = dummystdout()
> >
> >     ...
> >
> >     def write(data):
> >       ...
> >       stdout.write(data)
> >       stdout.flush()
> >
> > In other words, it saves a reference to sys.stdout for its own use and
> > then replaces sys.stdout with a dummy file like object that raises an
> > exception if written to in any way or flushed.
>
> As an avid user of "print" for debugging, this would bug me. I would
> prefer just avoiding the CGI case where stdout goes to the client, and
> otherwise saying that the server should try to put stdout output
> someplace where it can be read. But it could very well be a console,
> not necessarily a log file. Or the same log file as stderr, or...
> something.

Although using 'print' is handy. The reason I was making sys.stdout
off limits and not just merging the output with sys.stderr, is that at
least one Python web framework hijacks sys.stdout for their own
purposes so that people can use 'print' to generate the actual content
of the response. The package that does this is web.py
(http://webpy.org/). Not sure if there are others which do this.

Graham

From ianb at colorstudy.com Thu Mar 22 22:06:56 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 22 Mar 2007 16:06:56 -0500
Subject: [Web-SIG] Direct use of sys.stdout, sys.stderr and sys.stdin in WSGI application.
In-Reply-To: <88e286470703221403s4ef424a8q14ac2ffb4ff74d36@mail.gmail.com>
References: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
 <4602A8E6.6080805@colorstudy.com>
 <88e286470703221403s4ef424a8q14ac2ffb4ff74d36@mail.gmail.com>
Message-ID: <4602EFF0.1040201@colorstudy.com>

Graham Dumpleton wrote:
>> As an avid user of "print" for debugging, this would bug me. I would
>> prefer just avoiding the CGI case where stdout goes to the client, and
>> otherwise saying that the server should try to put stdout output
>> someplace where it can be read. But it could very well be a console,
>> not necessarily a log file. Or the same log file as stderr, or...
>> something.
>
> Although using 'print' is handy.
> The reason I was making sys.stdout
> off limits and not just merging the output with sys.stderr, is that at
> least one Python web framework hijacks sys.stdout for their own
> purposes so that people can use 'print' to generate the actual content
> of the response. The package that does this is web.py
> (http://webpy.org/). Not sure if there are others which do this.

I don't know of any others. As a debugging tool I'm not as concerned,
as if a web.py user used something I wrote I would have hopefully
removed all prints -- if I hadn't, it would be a bug (not an uncommon
bug, but a bug). And the web.py user just won't do this, because
they'll instantly break their app.

Paste also has something that will capture prints/sys.stdout and put it
into the page that is served up (paste.debug.prints). That middleware
strategy would probably work regardless of what the server does.

--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
 | Write code, do good | http://topp.openplans.org/careers

From graham.dumpleton at gmail.com Thu Mar 22 22:11:26 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Fri, 23 Mar 2007 08:11:26 +1100
Subject: [Web-SIG] Direct use of sys.stdout, sys.stderr and sys.stdin in WSGI application.
In-Reply-To: <5.1.1.6.0.20070322144429.02c5b8b8@sparrow.telecommunity.com>
References: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
 <5.1.1.6.0.20070322102826.02d12230@sparrow.telecommunity.com>
 <5.1.1.6.0.20070322144429.02c5b8b8@sparrow.telecommunity.com>
Message-ID: <88e286470703221411w28a2434et46fabed2eca810c6@mail.gmail.com>

On 23/03/07, Phillip J. Eby wrote:
> At 04:52 PM 3/22/2007 +0000, Alan Kennedy wrote:
> >But if the users want their "modern" python applications to be
> >portable everywhere on WSGI, e.g. returning (iterable) files as ouput,
>
> Actually, returning a file as output is a horrible idea, since it will
> massively reduce throughput, due to transmitting one line at a time to the
> web browser.
> :)

FWIW, in mod_wsgi I have a directive which allows one to optionally
override the prescribed WSGI behaviour of flushing after every chunk
returned. Instead, the data gets buffered up by Apache and written as
a large block rather than small blocks. Obviously you can't use this
if you intend streaming data and probably not a good idea if something
is returning a huge amount of data, but added it if for some reason you
are using some third party WSGI component which is written in a sloppy
way and generates lots of small blocks and you can't change it easily
or quickly. With minimal effort the directive allows you to quickly
improve throughput while you perhaps address the issues in the WSGI
component or add on top your own middleware component which does the
buffering in some other way which suits the actual application better.

Graham

From graham.dumpleton at gmail.com Fri Mar 30 00:09:49 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Fri, 30 Mar 2007 08:09:49 +1000
Subject: [Web-SIG] CherryPy WSGI server and wsgi.input.read() with no argument.
In-Reply-To: <435DF58A933BA74397B42CDEB8145A860A96FFB8@ex9.hostedexchange.local>
References: <88e286470703290348j68b0a333qb6e9935b610fa494@mail.gmail.com>
 <435DF58A933BA74397B42CDEB8145A860A96FFB8@ex9.hostedexchange.local>
Message-ID: <88e286470703291509r2702fd0cp2fe08f40b72624ae@mail.gmail.com>

Have cc'd this other to the web-sig list in case anyone wants to shoot
me down. :-)

On 30/03/07, Robert Brewer wrote:
> > Robert, was doing some testing with CherryPy WSGI server and noted
> > that if read() is called with no arguments on wsgi.input that it just
> > seems to hang indefinitely. Is there a problem here or have I managed
> > to stuff up very simple test. It works okay when I explicitly specify
> > content length.
>
> That's right. We simply hand the (blocking, makefiled) socket to the app
> as wsgi.input.
> PEP 333 says:
>
>     "The server is not required to read past the client's
>     specified Content-Length, and is allowed to simulate
>     an end-of-file condition if the application attempts
>     to read past that point. The application should not
>     attempt to read more data than is specified by the
>     CONTENT_LENGTH variable."
>
> We chose to not simulate the EOF, requiring app authors do that for
> themselves (mostly to give apps more flexibility). Note that the app
> side of CherryPy handles this for you by default. But since the spec
> clearly places the responsibility of checking content-length on the
> application side, it seemed redundant to perform the check both on the
> app side and the server side.

As I believe I have pointed out on the Python web-sig list before, the
statement:

  "The application should not attempt to read more data than is
  specified by the CONTENT_LENGTH variable."

is actually a bit bogus. This is because a WSGI middleware component
or web server could be acting as an input filter and decompressing a
content encoding of gzip for request. Since it knows the size will
change but will not know what the new size would be, except by
buffering it all, it by rights should remove CONTENT_LENGTH. This
presents a problem for an application as no CONTENT_LENGTH then to
rely on to know whether it has read too much input. If you leave
CONTENT_LENGTH intact, it may think it has read everything when there
is in fact more.

Also, with chunked transfer encoding you will not have CONTENT_LENGTH
either. I know you read it all in and buffer it so you can calculate
it, but that prevents streaming with chunked encoding where content
length may be based on a series of end to communications.

Thus, an application should really be just ignoring CONTENT_LENGTH and
just successively calling read() in some way until it returns an empty
string. It can't really work reliably in any other way.
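
As an illustration of the reading style argued for above, here is a
minimal sketch of an application-side loop that ignores CONTENT_LENGTH
and reads fixed-size blocks until read() returns an empty result. The
function name and the default block size are hypothetical. Note the
assumption it rests on: it only terminates if the server signals EOF,
which WSGI 1.0 does not require; against a server that hands over a raw
blocking socket, as described for CherryPy above, it would block.

```python
import io


def read_until_eof(stream, block_size=8192):
    """Collect the request body by reading blocks until read() returns b""."""
    chunks = []
    while True:
        chunk = stream.read(block_size)
        if not chunk:  # an empty result is the file-like EOF signal
            break
        chunks.append(chunk)
    return b"".join(chunks)
```

Called as read_until_eof(environ["wsgi.input"]), it works unchanged
whether or not CONTENT_LENGTH is present, provided the EOF assumption
holds.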

I believe that the WSGI adapter should be required (not just allowed)
to simulate EOF if it believes that no more input is available for
that request. For example, it knows at low level that CONTENT_LENGTH
was valid because no filtering by that point, or that in chunked
encoding that null block has been sent. The adapter is the only place
it will generally know that this is the case.

The only time that CONTENT_LENGTH may be of interest to an application
is if it is acting as a proxy to downstream web server as then it
needs to put it in downstream request. If no CONTENT_LENGTH or chunked
transfer encoding it would be forced to use chunked encoding for
downstream request.

FWIW, the conclusion I have come to is that if read() is called with
no arguments, then rather than, say, attempting to read all input in
one go based on some content length, the adapter should transparently
insert its own size argument underneath. This size would be based on
some block size deemed to perhaps give best performance based on
technology being used. Thus read() with no arguments would always
return potentially partial data and not all data. This is valid
because semantics of read() for a file like object is that one should
call it until it returns an empty string as EOF indicator. WSGI PEP is
ambiguous in that respect as it says it is a file like object but then
says you aren't supposed to read more than CONTENT_LENGTH and that an
adapter doesn't have to simulate EOF. One may say that this overrides
file like object properties, but the WSGI way will not work all the
time.

Graham

From foom at fuhm.net Fri Mar 30 00:52:41 2007
From: foom at fuhm.net (James Y Knight)
Date: Thu, 29 Mar 2007 18:52:41 -0400
Subject: [Web-SIG] CherryPy WSGI server and wsgi.input.read() with no argument.
In-Reply-To: <88e286470703291509r2702fd0cp2fe08f40b72624ae@mail.gmail.com>
References: <88e286470703290348j68b0a333qb6e9935b610fa494@mail.gmail.com>
 <435DF58A933BA74397B42CDEB8145A860A96FFB8@ex9.hostedexchange.local>
 <88e286470703291509r2702fd0cp2fe08f40b72624ae@mail.gmail.com>
Message-ID:

On Mar 29, 2007, at 6:09 PM, Graham Dumpleton wrote:
> On 30/03/07, Robert Brewer wrote:
>
>> We chose to not simulate the EOF, requiring app authors do that for
>> themselves

CherryPy's developers are correct: they are following the WSGI spec.
It is your app that is broken.

> As I believe I have pointed out on the Python web-sig list before, the
> statement:
>
> "The application should not attempt to read more data than is
> specified by the CONTENT_LENGTH variable."
>
> is actually a bit bogus.

This requirement comes from CGI. CGI scripts cannot support unknown
data lengths (yes, this means no chunked transfer). CONTENT_LENGTH is
required to be provided if there is data, and the server is not
required to provide an EOF after reading CONTENT_LENGTH bytes. WSGI
inherits the same restrictions.

I do agree with you that this was a mistake. WSGI should require WSGI
servers/gateway to provide an EOF for read(), always, and should make
a break from CGI and declare that CONTENT_LENGTH=0 means no data and
CONTENT_LENGTH empty/missing means undefined length. This is
something which ought to be fixed for the next revision of WSGI. This
makes it a tiny bit harder to write a CGI gateway, of course, but
it's worth it in my opinion, for the reasons you describe.

HOWEVER, given that the current WSGI spec does not specify that, apps
*cannot* depend upon that behavior. If your app does an unbounded
read(), it's wrong. And, by reference to the CGI spec, if a server
omits CONTENT_LENGTH, and there is data, it is wrong. The server ought
to return a 411 Length Required if you attempt to access a WSGI app
and provide chunked data.
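
For contrast, a minimal sketch of the bounded read that the current spec
implies: the application reads no more than CONTENT_LENGTH bytes and
treats a missing or empty value as "no body", following the CGI rule
described above. The function name and the 8192-byte block size are
hypothetical choices.

```python
import io


def read_declared_body(environ):
    """Read at most CONTENT_LENGTH bytes; a missing or empty value means none."""
    try:
        remaining = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        remaining = 0  # unparseable length: treat as no body
    stream = environ["wsgi.input"]
    chunks = []
    while remaining > 0:
        chunk = stream.read(min(remaining, 8192))
        if not chunk:  # client sent less than it declared
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Because every read() call carries an explicit size no larger than what is
left, this never asks the server for bytes past the declared length, so
it does not depend on the server simulating EOF.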
And, indeed, server code I wrote is wrong in just this way: it can omit CONTENT_LENGTH when given chunked data on input. Spec-compliant WSGI apps would then assume there's no input data which will then cause data loss. Luckily nobody ever passes chunked data on input. :) James PS: what about the readline(size) problem? Are we just going to continue indefinitely pretending that it's okay that the spec forbids using readline(size) and that cgi.FieldStorage calls it? Perhaps a WSGI 1.1 fixing these issues would be a good idea? From graham.dumpleton at gmail.com Fri Mar 30 01:59:05 2007 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Fri, 30 Mar 2007 09:59:05 +1000 Subject: [Web-SIG] CherryPy WSGI server and wsgi.input.read() with no argument. In-Reply-To: References: <88e286470703290348j68b0a333qb6e9935b610fa494@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A860A96FFB8@ex9.hostedexchange.local> <88e286470703291509r2702fd0cp2fe08f40b72624ae@mail.gmail.com> Message-ID: <88e286470703291659h2b933a35k5ca844b8f3d78eb@mail.gmail.com> On 30/03/07, James Y Knight wrote: > > On Mar 29, 2007, at 6:09 PM, Graham Dumpleton wrote: > > On 30/03/07, Robert Brewer wrote: > > > >> We chose to not simulate the EOF, requiring app authors do that for > >> themselves > > CherryPy's deveopers are correct: they are following the WSGI spec. > It is your app that is broken. Since my app is a ten line test program just to test what the CherryPy WSGI server does, I am not too concerned. :-) > This requirement comes from CGI. CGI scripts cannot support unknown > data lengths (yes, this means no chunked transfer). CONTENT_LENGTH is > required to be provided if there is data, and the server is not > required to provide an EOF after reading CONTENT_LENGTH bytes. WSGI > inherits the same restrictions. > > I do agree with you that this was a mistake. 
WSGI should require WSGI > servers/gateway to provide an EOF for read(), always, and should make > a break from CGI and declare that CONTENT_LENGTH=0 means no data and > CONTENT_LENGTH empty/missing means undefined length. This is > something which ought to be fixed for the next revision of WSGI. This > makes it a tiny bit harder to write a CGI gateway, of course, but > it's worth it in my opinion, for the reasons you describe. > > HOWEVER, given that the current WSGI spec does not specify that, apps > *cannot* depend upon that behavior. If your app does an unbounded read > (), it's wrong. And, by reference to the CGI spec, if a server omits > CONTENT_LENGTH, and there is data, it is wrong. The server ought to > return a 411 Length Required if you attempt to access a WSGI app and > provide chunked data. > > And, indeed, server code I wrote is wrong in just this way: it can > omit CONTENT_LENGTH when given chunked data on input. Spec-compliant > WSGI apps would then assume there's no input data which will then > cause data loss. Luckily nobody ever passes chunked data on input. :) > > James > > PS: what about the readline(size) problem? Are we just going to > continue indefinitely pretending that it's okay that the spec forbids > using readline(size) and that cgi.FieldStorage calls it? Perhaps a > WSGI 1.1 fixing these issues would be a good idea? At least we agree on the problems with the WSGI specification. My problem now is whether, in mod_wsgi, I implement it exactly as per the WSGI 1.0 specification and thus propagate these problems and limitations (and thereby block use of cgi.FieldStorage), or, if we can get some forward-looking consensus on what WSGI 1.1 should do, implement to that instead. I would rather address the problems now, as in the Apache world, once an Apache module gets installed, especially by a web hosting provider, it stays at that version for ages.
On the mod_python list we still have to deal with people using older versions of mod_python 2.7/3.0/3.1 which are many years old even though we are up to mod_python 3.3 now. I could also just implement what makes the most sense even if people don't want to agree on a general consensus that that is what WSGI 1.1 should do. As far as I can see so far, this would still be WSGI 1.0 compliant, but what is the point if a WSGI 1.0 compliant application can't make use of it and whereby WSGI 1.1 may never come out or be different anyway. Graham From ianb at colorstudy.com Fri Mar 30 02:19:44 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 29 Mar 2007 19:19:44 -0500 Subject: [Web-SIG] CherryPy WSGI server and wsgi.input.read() with no argument. In-Reply-To: <88e286470703291509r2702fd0cp2fe08f40b72624ae@mail.gmail.com> References: <88e286470703290348j68b0a333qb6e9935b610fa494@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A860A96FFB8@ex9.hostedexchange.local> <88e286470703291509r2702fd0cp2fe08f40b72624ae@mail.gmail.com> Message-ID: <460C57A0.9080506@colorstudy.com> Graham Dumpleton wrote: > ""The application should not attempt to read more data than is > specified by the CONTENT_LENGTH variable.""" > > is actually a bit bogus. > > This is because a WSGI middleware component or web server could be > acting as an input filter and decompressing a content encoding of gzip > for request. Since it knows the size will change but will not know > what the new size would be, except by buffering it all, it by rights > should remove CONTENT_LENGTH. This presents a problem for an > application as no CONTENT_LENGTH then to rely on to know whether it > has read to much input. If you leave CONTENT_LENGTH intact, it may > think it has read everything when there is in fact more. I thought leaving it out might be a good way to indicate content-length-unknown, but now I'm not so sure. 
I think a better indication is "-1", which works with cgi.FieldStorage and lots of other code, and generally .read(-1) means "give me everything you have". -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org | Write code, do good | http://topp.openplans.org/careers From pje at telecommunity.com Fri Mar 30 02:30:37 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 29 Mar 2007 19:30:37 -0500 Subject: [Web-SIG] CherryPy WSGI server and wsgi.input.read() with no argument. In-Reply-To: References: <88e286470703291509r2702fd0cp2fe08f40b72624ae@mail.gmail.com> <88e286470703290348j68b0a333qb6e9935b610fa494@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A860A96FFB8@ex9.hostedexchange.local> <88e286470703291509r2702fd0cp2fe08f40b72624ae@mail.gmail.com> Message-ID: <5.1.1.6.0.20070329191711.04129658@sparrow.telecommunity.com> At 06:52 PM 3/29/2007 -0400, James Y Knight wrote: >Perhaps a WSGI 1.1 fixing these issues would be a good idea? I would personally rather see a WSGI 2.0 that also gets rid of start_response(), write(), and perhaps adds better async support. I suspect that the current approach to using yield boundaries to indicate buffer flushing should be replaced with yielding an explicit flush request object. WSGI beginners seem to think that write() and yield are like "print" in CGI, and thus end up writing code that performs crappily on compliant servers. In retrospect, the "server push" use case is much less common and it's reasonable to have to do something explicit to support it. Middleware would also be happier if it could tell when the application really wanted to flush the output. Combining this with some way to yield "pauses" to better support async servers would be ideal. It would also be nice if you could cleanly adapt WSGI 1.0 to 2.0 and vice versa, as long as you're using a reasonable subset (i.e. a subset that doesn't care about some of the quirks we need to fix). 
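
For readers following the read() disagreement in this thread, here is a minimal sketch of the bounded read that the WSGI 1.0 wording leaves to the application. The helper name `read_request_body` and the `block_size` default are illustrative assumptions, not part of the spec or of CherryPy. A missing, empty, malformed, or negative CONTENT_LENGTH is treated as "no body", which is the CGI-derived reading described above; it does not implement the "-1 means unknown length" idea.

```python
import io


def read_request_body(environ, block_size=8192):
    """Read at most CONTENT_LENGTH bytes from wsgi.input, never more."""
    try:
        remaining = int(environ.get('CONTENT_LENGTH') or 0)
    except ValueError:
        remaining = 0  # assumption: a malformed value means "no body"
    stream = environ['wsgi.input']
    chunks = []
    while remaining > 0:
        # Never ask for more than the declared length still outstanding.
        chunk = stream.read(min(block_size, remaining))
        if not chunk:
            break  # the server simulated EOF before the declared length
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)


# Usage: the stream holds more bytes than CONTENT_LENGTH declares.
environ = {'CONTENT_LENGTH': '5', 'wsgi.input': io.BytesIO(b'hello, extra')}
body = read_request_body(environ)
```

The loop works whether or not the server simulates EOF, which is why it is safe on servers that follow either interpretation.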
From ianb at colorstudy.com Fri Mar 30 02:56:36 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 29 Mar 2007 19:56:36 -0500
Subject: [Web-SIG] WSGI 2.0
Message-ID: <460C6044.2090602@colorstudy.com>

Do we want to discuss WSGI 2.0? I added a wiki page here to list
anything anyone wants to discuss for 2.0: http://wsgi.org/wsgi/WSGI_2.0

I've listed the things I can remember, and copying here:

start_response and write
------------------------

We could remove ``start_response`` and the writer that it implies. This
would lead to a signature like::

    def app(environ):
        return '200 OK', [('Content-type', 'text/plain')], ['Hello world']

That is, return a three-tuple of (status, headers, app_iter).

It's relatively simple to provide adapters to and from this signature to
the WSGI 1.0 signature.

Optional keys (removing)
------------------------

Several keys are optional in WSGI, but required in CGI, in particular
``SCRIPT_NAME``, ``PATH_INFO`` and ``QUERY_STRING``. Also
``REMOTE_ADDR`` and ``SERVER_SOFTWARE`` are supposed to exist.

Unknown-length wsgi.input
-------------------------

There's no documented way to indicate that there *is* content in
``environ['wsgi.input']``, but the content length is unknown. A value
of ``"-1"`` may work in many situations. A missing ``CONTENT_LENGTH``
doesn't generally work currently (it's assumed to mean 0 by much code).

readline(size)
--------------

Currently the specification does not require servers to provide
``environ['wsgi.input'].readline(size)`` (the size argument in
particular). But ``cgi.FieldStorage`` calls readline this way, so in
effect it is required.

app_iter and threads
--------------------

It's not clear if the app_iter must be used in the same thread as the
application. Since the application is blocking, presumably *it* must be
run all in one thread. This should be more explicitly documented.
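
As a rough illustration of the "adapters to and from this signature" remark in the start_response section, the sketch below wraps an app in each direction. The names `to_wsgi1` and `to_wsgi2` are hypothetical, and the three-tuple signature is only the proposal from this message, not a published standard. The 1.0-to-proposal direction is deliberately simplistic: it consumes the whole body up front (so streaming is lost), ignores `exc_info`, and places anything sent through the legacy `write()` callable ahead of the iterated chunks.

```python
def to_wsgi1(app2):
    """Expose a (status, headers, app_iter) app as a WSGI 1.0 app."""
    def wsgi1_app(environ, start_response):
        status, headers, app_iter = app2(environ)
        start_response(status, headers)
        return app_iter
    return wsgi1_app


def to_wsgi2(app1):
    """Expose a WSGI 1.0 app as a (status, headers, app_iter) app."""
    def wsgi2_app(environ):
        captured = {}
        written = []  # chunks passed to the legacy write() callable

        def start_response(status, headers, exc_info=None):
            captured['status'] = status
            captured['headers'] = headers
            return written.append

        app_iter = app1(environ, start_response)
        try:
            # Consuming the iterable guarantees start_response was called.
            body = list(app_iter)
        finally:
            close = getattr(app_iter, 'close', None)
            if close is not None:
                close()
        return captured['status'], captured['headers'], written + body
    return wsgi2_app


# Usage: round-trip the example app from the proposal (bytes body here).
def hello(environ):
    return '200 OK', [('Content-type', 'text/plain')], [b'Hello world']


result = to_wsgi2(to_wsgi1(hello))({})
```

A production adapter would need to keep the body lazy, which is where the threading and write() questions raised elsewhere in this thread come in.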
-- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org | Write code, do good | http://topp.openplans.org/careers From graham.dumpleton at gmail.com Fri Mar 30 03:10:17 2007 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Fri, 30 Mar 2007 11:10:17 +1000 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <460C6044.2090602@colorstudy.com> References: <460C6044.2090602@colorstudy.com> Message-ID: <88e286470703291810n56593d3bq98ad7717f06b520e@mail.gmail.com> On 30/03/07, Ian Bicking wrote: > Do we want to discuss WSGI 2.0? I added a wiki page here to list > anything anyone wants to discuss for 2.0: http://wsgi.org/wsgi/WSGI_2.0 > > I've listed the things I can remember, and copying here: > > ... > > Optional keys (removing) > ------------------------ > > Several keys are optional in WSGI, but required in CGI, in particular > ``SCRIPT_NAME``, ``PATH_INFO`` and ``QUERY_STRING``. Also > ``REMOTE_ADDR`` and ``SERVER_SOFTWARE`` are supposed to exist. Huh. Where does it say that SCRIPT_NAME can be optional in WSGI. I know it can be empty if mount point is the root of the web server, but that it can not be there at all is new to me. One other issue if aiming at supporting chunked encoding for a request, is how (if one even can) make available the trailing headers if present after the final null data block. Personally I am not sure this one is worth the trouble and may be quite hard to even implement with some web servers as they don't even provide them as a separate set of headers but simply merge them on top of the main request headers. 
Graham From foom at fuhm.net Fri Mar 30 03:35:22 2007 From: foom at fuhm.net (James Y Knight) Date: Thu, 29 Mar 2007 21:35:22 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <460C6044.2090602@colorstudy.com> References: <460C6044.2090602@colorstudy.com> Message-ID: On Mar 29, 2007, at 8:56 PM, Ian Bicking wrote: > readline(size) > -------------- > > Currently the specification does not require servers to provide > ``environ['wsgi.input'].readline(size)`` (the size argument in > particular). But ``cgi.FieldStorage`` calls readline this way, so in > effect it is required. I actually think a minor revision to WSGI should be issued immediately, the only change being that readline(size) is required to be implemented by servers/gateways, and bumping the rev number to 1.1. Leaving the spec as it is is basically a lie. You cannot implement a WSGI server now, without implementing readline(size) and expect apps to work. Adding this is a completely backwards compatible change, and is probably already implemented in most (all?) servers, so it shouldn't be controversial. James From pje at telecommunity.com Fri Mar 30 04:41:12 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 29 Mar 2007 21:41:12 -0500 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <460C6044.2090602@colorstudy.com> Message-ID: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com> At 07:56 PM 3/29/2007 -0500, Ian Bicking wrote: >Do we want to discuss WSGI 2.0? I added a wiki page here to list >anything anyone wants to discuss for 2.0: http://wsgi.org/wsgi/WSGI_2.0 > >I've listed the things I can remember, and copying here: > > >start_response and write >------------------------ > >We could remove ``start_response`` and the writer that it implies. This >would lead to a signature like:: > > def app(environ): > return '200 OK', [('Content-type', 'text/plain')], ['Hello world'] > >That is, return a three-tuple of (status, headers, app_iter). 
> >It's relatively simple to provide adapters to and from this signature to >the WSGI 1.0 signature. I think we also want to have a value you can yield from the app_iter to explicitly request that the buffer be flushed, and that we should reopen the discussion about values to be yielded to communicate with async servers, indicating that the iterator should be paused pending input or some other operation. Ideally, this should be done in a way that's easy for middleware to handle; a flush signal should be handled by the middleware *and* passed up the chain, while any other async signals would be passed directly up the chain (unless it's something like "pause for input" and the middleware controls the input). If we do this right, it should be easier to write middleware that works correctly with respect to buffering, since the issues of flushing and pausing now become explicit rather than implicit. (This should make it easier to teach/learn as well.) >It's not clear if the app_iter must be used in the same thread as the >application. Since the application is blocking, presumably *it* must be >run all in one thread. This should be more explicitly documented. Definitely. I think that we should not require thread affinity between the application and the app_iter -- my feeling at this point is that actual yielding is an edge case with respect to most WSGI apps. The common case WSGI application should be just returning a list or tuple with a single string in it, and not doing any complex iteration. Allowing the server more flexibility here is probably the better choice. Indeed, I'm not sure we should require thread affinity across invocations of app_iter.next(). 
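
One way to picture the "handled by the middleware *and* passed up the chain" rule for flush signals is the sketch below. Everything in it is an assumption made for illustration: WSGI defines no `FLUSH` object, and `buffering_filter` is a hypothetical helper that stands in for middleware such as a compressor.

```python
FLUSH = object()  # hypothetical flush marker; not defined by any WSGI spec


def buffering_filter(app_iter, transform):
    """Buffer body chunks and apply `transform` to each flushed batch."""
    pending = []
    for item in app_iter:
        if item is FLUSH:
            if pending:
                # Handle the flush locally: emit what has been buffered.
                yield transform(b''.join(pending))
                pending = []
            yield FLUSH  # and still pass the signal up to the caller
        else:
            pending.append(item)
    if pending:
        yield transform(b''.join(pending))


# Usage: two content boundaries separated by an explicit flush.
def body():
    yield b'part one, '
    yield b'still part one'
    yield FLUSH
    yield b'part two'


out = list(buffering_filter(body(), bytes.upper))
```

Because the marker is a distinct object rather than a string, it cannot be mistaken for body content by code that only expects byte chunks.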
From foom at fuhm.net Fri Mar 30 05:08:39 2007 From: foom at fuhm.net (James Y Knight) Date: Thu, 29 Mar 2007 23:08:39 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com> References: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com> Message-ID: On Mar 29, 2007, at 10:41 PM, Phillip J. Eby wrote: >> It's not clear if the app_iter must be used in the same thread as the >> application. Since the application is blocking, presumably *it* >> must be >> run all in one thread. This should be more explicitly documented. > > Definitely. I think that we should not require thread affinity > between the > application and the app_iter -- my feeling at this point is that > actual > yielding is an edge case with respect to most WSGI apps. The > common case > WSGI application should be just returning a list or tuple with a > single > string in it, and not doing any complex iteration. Allowing the > server > more flexibility here is probably the better choice. > > Indeed, I'm not sure we should require thread affinity across > invocations > of app_iter.next(). I recall last time this issue was considered, one of the fundamental problems is that, if the same thread isn't used for both the app and all app_iter.next invocations, sqlite cannot be used. (unless you don't call sqlite functions in the iterate part, of course). And I'm sure there's other libraries that are similarly thread-safe but only if you restrict yourself to a single thread per handle. That problem made me uncomfortable enough with using non-dedicated threads that I didn't attempt it. If WSGI 2.0 explicitly states that each call to the app's iterator can occur on a different thread, then I'd be more confident in telling people that it's their code that was broken. I suppose another flag could be added "wsgi.dedicated_thread" which is True only if every call to .next will be on the same thread as the call to your app. 
Of course that doesn't really help an app broken by it, just lets them error out early. James From ianb at colorstudy.com Fri Mar 30 06:11:33 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 29 Mar 2007 23:11:33 -0500 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: References: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com> Message-ID: <460C8DF5.20601@colorstudy.com> James Y Knight wrote: > > On Mar 29, 2007, at 10:41 PM, Phillip J. Eby wrote: > >>> It's not clear if the app_iter must be used in the same thread as the >>> application. Since the application is blocking, presumably *it* must be >>> run all in one thread. This should be more explicitly documented. >> >> Definitely. I think that we should not require thread affinity >> between the >> application and the app_iter -- my feeling at this point is that actual >> yielding is an edge case with respect to most WSGI apps. The common case >> WSGI application should be just returning a list or tuple with a single >> string in it, and not doing any complex iteration. Allowing the server >> more flexibility here is probably the better choice. >> >> Indeed, I'm not sure we should require thread affinity across invocations >> of app_iter.next(). > > I recall last time this issue was considered, one of the fundamental > problems is that, if the same thread isn't used for both the app and all > app_iter.next invocations, sqlite cannot be used. (unless you don't call > sqlite functions in the iterate part, of course). And I'm sure there's > other libraries that are similarly thread-safe but only if you restrict > yourself to a single thread per handle. This aspect of SQLite totally sucks. But I haven't encountered any other libraries with the same restrictions. I might just not notice -- quite possible -- but still, I haven't noticed it. And of course pre-fetching the results solves the problem. The advantages seem much more substantial than to make it worth it to cater to one stupid library. 
At least it *seems* like there's an advantage, in that an async server could handle lots of slow-consuming clients (or large responses) without a whole lot of overhead, because it could deal with all the app_iter's in a single thread. If that wouldn't work anyway, then it's no good, but I'm assuming that could work. > That problem made me uncomfortable enough with using non-dedicated > threads that I didn't attempt it. If WSGI 2.0 explicitly states that > each call to the app's iterator can occur on a different thread, then > I'd be more confident in telling people that it's their code that was > broken. I suppose another flag could be added "wsgi.dedicated_thread" > which is True only if every call to .next will be on the same thread as > the call to your app. Of course that doesn't really help an app broken > by it, just lets them error out early. That's essentially what wsgi.threaded and wsgi.multiprocess do. I think it's a reasonable thing to give, because there is some potential that you'd get incorrect data instead of an exception if there really was problematic code. And it would allow a SQLite user to at least call list() (or fetchall) on their app_iter. -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org | Write code, do good | http://topp.openplans.org/careers From ianb at colorstudy.com Fri Mar 30 06:16:07 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 29 Mar 2007 23:16:07 -0500 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <88e286470703291810n56593d3bq98ad7717f06b520e@mail.gmail.com> References: <460C6044.2090602@colorstudy.com> <88e286470703291810n56593d3bq98ad7717f06b520e@mail.gmail.com> Message-ID: <460C8F07.7000400@colorstudy.com> Graham Dumpleton wrote: > On 30/03/07, Ian Bicking wrote: >> Do we want to discuss WSGI 2.0? I added a wiki page here to list >> anything anyone wants to discuss for 2.0: http://wsgi.org/wsgi/WSGI_2.0 >> >> I've listed the things I can remember, and copying here: >> >> ... 
>> >> Optional keys (removing) >> ------------------------ >> >> Several keys are optional in WSGI, but required in CGI, in particular >> ``SCRIPT_NAME``, ``PATH_INFO`` and ``QUERY_STRING``. Also >> ``REMOTE_ADDR`` and ``SERVER_SOFTWARE`` are supposed to exist. > > Huh. Where does it say that SCRIPT_NAME can be optional in WSGI. I > know it can be empty if mount point is the root of the web server, but > that it can not be there at all is new to me. "The following variables must be present, unless their value would be an empty string, in which case they may be omitted, except as otherwise noted below." It doesn't really say that SCRIPT_NAME and PATH_INFO are optional, but it doesn't clearly say they are not optional. QUERY_STRING specifically is optional, but there's a bug in cgi.FieldStorage if you ever do omit it, so you really shouldn't. And in the CGI spec QUERY_STRING is not optional. I actually don't like REMOTE_ADDR being required, as sometimes it is not applicable. For instance, if you are pre-requesting a resource or doing a totally internal request. I could imagine putting a non-IP address there, but I think it would be better simply to omit the variable. SERVER_SOFTWARE is mostly silly. > One other issue if aiming at supporting chunked encoding for a > request, is how (if one even can) make available the trailing headers > if present after the final null data block. Personally I am not sure > this one is worth the trouble and may be quite hard to even implement > with some web servers as they don't even provide them as a separate > set of headers but simply merge them on top of the main request > headers. Can you put this on the wiki? 
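
Until the optional-key question is settled, an application or middleware can defend itself against omitted keys. The following is a small sketch under that assumption; `normalized_environ` is a hypothetical helper name. It fills in only the three path/query keys named above and leaves REMOTE_ADDR alone, since a missing address may be meaningful for internal requests.

```python
def normalized_environ(environ):
    """Return a copy with the may-be-omitted CGI keys present as ''."""
    result = dict(environ)
    for key in ('SCRIPT_NAME', 'PATH_INFO', 'QUERY_STRING'):
        # An omitted key and an empty string mean the same thing here.
        result.setdefault(key, '')
    return result


# Usage: a server that omitted empty values.
sparse = {'REQUEST_METHOD': 'GET', 'PATH_INFO': '/blog'}
full = normalized_environ(sparse)
```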
-- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org | Write code, do good | http://topp.openplans.org/careers From ianb at colorstudy.com Fri Mar 30 06:30:58 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 29 Mar 2007 23:30:58 -0500 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com> References: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com> Message-ID: <460C9282.9020507@colorstudy.com> Phillip J. Eby wrote: > At 07:56 PM 3/29/2007 -0500, Ian Bicking wrote: >> Do we want to discuss WSGI 2.0? I added a wiki page here to list >> anything anyone wants to discuss for 2.0: http://wsgi.org/wsgi/WSGI_2.0 >> >> I've listed the things I can remember, and copying here: >> >> >> start_response and write >> ------------------------ >> >> We could remove ``start_response`` and the writer that it implies. This >> would lead to a signature like:: >> >> def app(environ): >> return '200 OK', [('Content-type', 'text/plain')], ['Hello >> world'] >> >> That is, return a three-tuple of (status, headers, app_iter). >> >> It's relatively simple to provide adapters to and from this signature to >> the WSGI 1.0 signature. > > I think we also want to have a value you can yield from the app_iter to > explicitly request that the buffer be flushed, and that we should reopen > the discussion about values to be yielded to communicate with async > servers, indicating that the iterator should be paused pending input or > some other operation. (this should probably be opened as a separate item from the signature change, as I don't think it relates much to that) I'd rather not introduce new objects, since we don't have any new objects yet. None is an obvious object, but it's vague in this context. To me it feels more like a pause than a flush. Flush really means *do* something, and None feels like the no-op, which is more like a pause. 
I've become interested in using WSGI middleware as an HTTP translating proxy, so the async opportunities are of more interest to me now. In part just the app_iter non-thread-affinity change would be helpful, I think. Dealing with large request bodies is harder, I think, because those would have to be processed before the WSGI app returned. But that's less concerning to me.

It seems like if yielding None from an app_iter meant "put me at the back of the queue" that would be a fairly simple and effective way of handling async for large (or slow) response bodies. This wouldn't really work for the Twisted stuff where you keep a response open and trickle out data based on server-side events (because you can't control when you get back to the beginning of the queue), but otherwise it seems pretty good. I suppose full control could be allowed if you could do something like return an object that could be part of the event loop somehow. If we had some standard async-wrapping-key of some sort, perhaps. For example (I say with no real knowledge of Deferred):

environ['wsgi.async_callback'] = EventMatcher
# in the app:
yield environ['wsgi.async_callback'](some_event)
# in the server:
for item in app_iter:
    if isinstance(item, EventMatcher):
        # queue up the app_iter, leaving it paused until something
        # matching that event happens

I feel somehow that it could be useful for intermediaries to be able to filter out this callback, and so a documented key (or keys) would be good. But I can't quite place why I'd want to do that. Well, except that any intermediary would have to be able to detect this kind of object and pass it back up. So maybe instead of filtering it out of the environ, there needs to be some easy test that can be applied.

What the event object looks like ("some_event"), I have no idea.
> Ideally, this should be done in a way that's easy for middleware to > handle; a flush signal should be handled by the middleware *and* passed > up the chain, while any other async signals would be passed directly up > the chain (unless it's something like "pause for input" and the > middleware controls the input). > > If we do this right, it should be easier to write middleware that works > correctly with respect to buffering, since the issues of flushing and > pausing now become explicit rather than implicit. (This should make it > easier to teach/learn as well.) In terms of buffering, I can't think of many cases where it would matter. Either the middleware passes back the response with no changes, or it needs to consume the entire response body (and probably headers and maybe status) to do whatever transformation it needs to do. Things like pauses and async signals would ideally be passed upstream, but flushes and content would all be consumed by the middleware. >> It's not clear if the app_iter must be used in the same thread as the >> application. Since the application is blocking, presumably *it* must be >> run all in one thread. This should be more explicitly documented. > > Definitely. I think that we should not require thread affinity between > the application and the app_iter -- my feeling at this point is that > actual yielding is an edge case with respect to most WSGI apps. The > common case WSGI application should be just returning a list or tuple > with a single string in it, and not doing any complex iteration. > Allowing the server more flexibility here is probably the better choice. > > Indeed, I'm not sure we should require thread affinity across > invocations of app_iter.next(). It seems unlikely there'd be a need to move it between threads, but then it doesn't seem like there's much need for the application to have it all called in one thread either (i.e., if you move threads once, moving threads again shouldn't be a problem). 
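
The simpler idea from this message, that yielding None means "put me at the back of the queue", can be sketched as a single-threaded loop. This is only an illustration under stated assumptions: `serve_round_robin` and its `send(index, chunk)` callback are hypothetical names, and a real server would write to sockets and wait on I/O readiness instead of spinning.

```python
from collections import deque


def serve_round_robin(app_iters, send):
    """Drive several app_iters in one thread; None means "requeue me"."""
    queue = deque((index, iter(app_iter))
                  for index, app_iter in enumerate(app_iters))
    while queue:
        index, iterator = queue.popleft()
        for chunk in iterator:
            if chunk is None:
                # The app has nothing to send yet: give others a turn.
                queue.append((index, iterator))
                break
            send(index, chunk)
        # An exhausted iterator is not requeued, so it drops out here.


# Usage: one response that pauses and one that does not.
def slow():
    yield b'a1'
    yield None
    yield b'a2'


def fast():
    yield b'b1'
    yield b'b2'


log = []
serve_round_robin([slow(), fast()], lambda index, chunk: log.append((index, chunk)))
```

As noted above, the application cannot control when it gets back to the front of the queue, so this does not cover event-driven trickling of data.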
-- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org | Write code, do good | http://topp.openplans.org/careers From pje at telecommunity.com Fri Mar 30 18:46:38 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 30 Mar 2007 11:46:38 -0500 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: References: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com> <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com> Message-ID: <5.1.1.6.0.20070330112527.02c82c70@sparrow.telecommunity.com> At 11:08 PM 3/29/2007 -0400, James Y Knight wrote: >On Mar 29, 2007, at 10:41 PM, Phillip J. Eby wrote: >>>It's not clear if the app_iter must be used in the same thread as the >>>application. Since the application is blocking, presumably *it* >>>must be >>>run all in one thread. This should be more explicitly documented. >> >>Definitely. I think that we should not require thread affinity >>between the >>application and the app_iter -- my feeling at this point is that >>actual >>yielding is an edge case with respect to most WSGI apps. The >>common case >>WSGI application should be just returning a list or tuple with a >>single >>string in it, and not doing any complex iteration. Allowing the >>server >>more flexibility here is probably the better choice. >> >>Indeed, I'm not sure we should require thread affinity across >>invocations >>of app_iter.next(). > >I recall last time this issue was considered, one of the fundamental >problems is that, if the same thread isn't used for both the app and >all app_iter.next invocations, sqlite cannot be used. (unless you >don't call sqlite functions in the iterate part, of course). And I'm >sure there's other libraries that are similarly thread-safe but only >if you restrict yourself to a single thread per handle. Right -- but the point here is that you only need to *have* an iterator if you're doing server push or trying to stream large files. 
I don't mind making these corner cases a bit tougher to implement, since they're fairly tough already. If you're running a WSGI 1.0 app under a 2.0->1.0 adapter, you can always use an adapter that ensures thread affinity. Indeed, any 2.0->1.0 adapter that supports multiple write() calls is going to need to have some sort of threading mechanism anyway, unless it uses greenlets. >That problem made me uncomfortable enough with using non-dedicated >threads that I didn't attempt it. If WSGI 2.0 explicitly states that >each call to the app's iterator can occur on a different thread, then >I'd be more confident in telling people that it's their code that was >broken. I suppose another flag could be added "wsgi.dedicated_thread" >which is True only if every call to .next will be on the same thread >as the call to your app. Of course that doesn't really help an app >broken by it, just lets them error out early. I'd like to have fewer optional things, rather than more, so I think we should either require a dedicated thread or make it non-dedicated. It should be quite straightforward to implement a middleware component that ensures its wrappee is run entirely within a dedicated thread, using a Queue. From pje at telecommunity.com Fri Mar 30 19:06:23 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 30 Mar 2007 12:06:23 -0500 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <460C9282.9020507@colorstudy.com> References: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com> <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com> Message-ID: <5.1.1.6.0.20070330115237.02cb6ee8@sparrow.telecommunity.com> At 11:30 PM 3/29/2007 -0500, Ian Bicking wrote: >Phillip J. Eby wrote: >>At 07:56 PM 3/29/2007 -0500, Ian Bicking wrote: >>>Do we want to discuss WSGI 2.0? 
I added a wiki page here to list >>>anything anyone wants to discuss for 2.0: http://wsgi.org/wsgi/WSGI_2.0 >>> >>>I've listed the things I can remember, and copying here: >>> >>> >>>start_response and write >>>------------------------ >>> >>>We could remove ``start_response`` and the writer that it implies. This >>>would lead to a signature like:: >>> >>> def app(environ): >>> return '200 OK', [('Content-type', 'text/plain')], ['Hello world'] >>> >>>That is, return a three-tuple of (status, headers, app_iter). >>> >>>It's relatively simple to provide adapters to and from this signature to >>>the WSGI 1.0 signature. >>I think we also want to have a value you can yield from the app_iter to >>explicitly request that the buffer be flushed, and that we should reopen >>the discussion about values to be yielded to communicate with async >>servers, indicating that the iterator should be paused pending input or >>some other operation. > >(this should probably be opened as a separate item from the signature >change, as I don't think it relates much to that) > >I'd rather not introduce new objects, since we don't have any new objects >yet. None is an obvious object, but it's vague in this context. To me it >feels more like a pause than a flush. Flush really means *do* something, >and None feels like the no-op, which is more like a pause. > >I've become interested in using WSGI middleware as an HTTP translating >proxy, so the async opportunities are of more interest to me now. In part >just the app_iter non-thread-affinity change would be helpful, I >think. Dealing with large request bodies is harder, I think, because >those would have to be processed before the WSGI app returned. But that's >less concerning to me. > >It seems like if yielding None from an app_iter meant "put me at the back >of the queue" that would be a fairly simple and effective way of handling >async for large (or slow) response bodies. 
This wouldn't really work for >the Twisted stuff where you keep a response open and trickle out data >based on server-side events (because you can't control when you get back >to the beginning of the queue), but otherwise it seems pretty good. I >suppose full control could be allowed if you could do something like >return an object that could be part of the event loop somehow. If we had >some standard async-wrapping-key of some sort, perhaps. For example (I >say with no real knowledge of Deferred): > >environ['wsgi.async_callback'] = EventMatcher ># in the app: >yield environ['wsgi.async_callback'](some_event) ># in the server: >for item in app_iter: > if isinstance(item, EventMatcher): > # queue up the app_iter, leaving it paused until something > # matching that event happens I was thinking of something a bit simpler; the environ key would be an object that, when called, tells the server that it's okay to resume iteration attempts on the application. A sort of "put me back on the queue for iteration" call. The callback would have to be safe to call from any thread at any time, and must not re-enter anything, just re-enable iteration. >I feel somehow that it could be useful for intermediaries to be able to >filter out this callback, and so a documented key (or keys) would be >good. But I can't quite place why I'd want to do that. Well, except that >any intermediary would have to be able to detect this kind of object and >pass it back up. So maybe instead of filtering it out of the environ, >there needs to be some easy test that can be applied. My thought is that flow control could be done with tuples whose first element is a number, and whose other elements are arguments. Why a number and not a string? So that if you forget to make it a tuple, it won't be sent as part of the output stream; it'll be detected as an error. 
Also, numbers are harder to assign and keep track of, and we want to have a very small set of strictly-defined flow control operations: pause (aka "nothing to report yet"), flush, and perhaps "wait for input". Alternatively, we could just go with numbers and not worry about tuples at all. I don't actually know of anything that needs an argument. >>Ideally, this should be done in a way that's easy for middleware to >>handle; a flush signal should be handled by the middleware *and* passed >>up the chain, while any other async signals would be passed directly up >>the chain (unless it's something like "pause for input" and the >>middleware controls the input). >>If we do this right, it should be easier to write middleware that works >>correctly with respect to buffering, since the issues of flushing and >>pausing now become explicit rather than implicit. (This should make it >>easier to teach/learn as well.) > >In terms of buffering, I can't think of many cases where it would >matter. Either the middleware passes back the response with no changes, >or it needs to consume the entire response body (and probably headers and >maybe status) to do whatever transformation it needs to do. > >Things like pauses and async signals would ideally be passed upstream, but >flushes and content would all be consumed by the middleware. I can't think of any condition where middleware would *not* pass all of these up to its caller. In the case of a "flush", it needs to first yield any buffered output, but it *must* still yield the flush. For example, if you're doing server push, then the app should yield a flush prior to each new content boundary. If the middleware is doing compression or some such, then it needs to restart encoding after each content boundary, as well as flush the prior encoded output. >>>It's not clear if the app_iter must be used in the same thread as the >>>application. Since the application is blocking, presumably *it* must be >>>run all in one thread. 
This should be more explicitly documented. >>Definitely. I think that we should not require thread affinity between >>the application and the app_iter -- my feeling at this point is that >>actual yielding is an edge case with respect to most WSGI apps. The >>common case WSGI application should be just returning a list or tuple >>with a single string in it, and not doing any complex iteration. >>Allowing the server more flexibility here is probably the better choice. >>Indeed, I'm not sure we should require thread affinity across invocations >>of app_iter.next(). > >It seems unlikely there'd be a need to move it between threads, In the case of Twisted, the easiest way to run possibly-blocking app code would be "deferToThread(app_iter.next)", and the code could end up running in any of several pooled threads, each time. So, really, the nominal case for Twisted is the one where you'd want there to be no need for affinity across iterations. >but then it doesn't seem like there's much need for the application to >have it all called in one thread either (i.e., if you move threads once, >moving threads again shouldn't be a problem). > > >-- >Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org > | Write code, do good | http://topp.openplans.org/careers From foom at fuhm.net Fri Mar 30 19:26:02 2007 From: foom at fuhm.net (James Y Knight) Date: Fri, 30 Mar 2007 13:26:02 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <5.1.1.6.0.20070330112527.02c82c70@sparrow.telecommunity.com> References: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com> <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com> <5.1.1.6.0.20070330112527.02c82c70@sparrow.telecommunity.com> Message-ID: <047DEF82-CF27-4AA9-B611-0E8602E91C6D@fuhm.net> On Mar 30, 2007, at 12:46 PM, Phillip J. Eby wrote: >> I suppose another flag could be added "wsgi.dedicated_thread" >> which is True only if every call to .next will be on the same thread >> as the call to your app. 
Of course that doesn't really help an app >> broken by it, just lets them error out early. > > I'd like to have fewer optional things, rather than more, so I > think we should either require a dedicated thread or make it non- > dedicated. It should be quite straightforward to implement a > middleware component that ensures its wrappee is run entirely > within a dedicated thread, using a Queue. You can't *require* the server to switch threads every iteration. In fact I'm willing to bet many servers will continue using a dedicated thread even if they're explicitly allowed to not do so. So having some indication as to which the server is doing might be helpful. James From fumanchu at amor.org Fri Mar 30 19:32:19 2007 From: fumanchu at amor.org (Robert Brewer) Date: Fri, 30 Mar 2007 10:32:19 -0700 Subject: [Web-SIG] WSGI 2 and SERVER_PROTOCOL Message-ID: <435DF58A933BA74397B42CDEB8145A860AA41BF3@ex9.hostedexchange.local> RFC 2145 says: "An implementation of HTTP/x.b sending a message to a recipient whose version is known to be HTTP/x.a, a < b, MUST NOT depend on the recipient understanding a header not defined in the specification for HTTP/x.a. For example, HTTP/1.0 clients cannot be expected to understand chunked encodings, and so an HTTP/1.1 server must never send "Transfer-Encoding: chunked" in response to an HTTP/1.0 request." In specific cases, implementations can choose to send some HTTP/1.1 headers to HTTP/1.0 clients, but in the general case, the solution is usually to downgrade the entire HTTP response to 1.0 features only. Under WSGI, "an implementation of HTTP/x.b" is an emergent property of the entire stack; servers, middleware, and applications all share this responsibility to downgrade the entire response to HTTP/1.0 features if any of the other components is not HTTP/1.1 compliant. Unfortunately, the WSGI 1.0 spec doesn't require WSGI servers to tell WSGI applications what version of HTTP they support. 
If a WSGI origin server "fails to satisfy one or more of the MUST or REQUIRED level requirements for the protocols it implements" (as too many WSGI servers do!), WSGI applications have no standardized way of knowing this, and may output headers which contradict the version number output by the WSGI server.

CherryPy hacks around this by having the origin server send a custom entry in the WSGI environ called "ACTUAL_SERVER_PROTOCOL", which tells the rest of the WSGI stack the version for which the origin server is at least conditionally compliant:

    # Compare request and server HTTP protocol versions, in case our
    # server does not support the requested protocol. Limit our output
    # to min(req, server). We want the following output:
    #     request    server     actual written   supported response
    #     protocol   protocol  response protocol    feature set
    # a     1.0        1.0           1.0                1.0
    # b     1.0        1.1           1.1                1.0
    # c     1.1        1.0           1.0                1.0
    # d     1.1        1.1           1.1                1.1
    # Notice that, in (b), the response will be "HTTP/1.1" even though
    # the client only understands 1.0. RFC 2616 10.5.6 says we should
    # only return 505 if the _major_ version is different.
    rp = int(req_protocol[5]), int(req_protocol[7])
    sp = int(server.protocol[5]), int(server.protocol[7])
    if sp[0] != rp[0]:
        self.simple_response("505 HTTP Version Not Supported")
        return
    # Bah. "SERVER_PROTOCOL" is actually the REQUEST protocol.
    environ["SERVER_PROTOCOL"] = req_protocol
    # set a non-standard environ entry so the WSGI app can know what
    # the *real* server protocol is (and what features to support).
    # See http://www.faqs.org/rfcs/rfc2145.html.
    environ["ACTUAL_SERVER_PROTOCOL"] = server.protocol
    self.response_protocol = "HTTP/%s.%s" % min(rp, sp)

The "application-side" bits of CherryPy inspect this value (if present) and perform the same min(rp, sp) calculation as the server in order to determine which features to support.

WSGI 2 should, at the least, add a standard environ entry similar to ACTUAL_SERVER_PROTOCOL.
This would provide the minimum enforcement of full-stack compliance, since WSGI origin servers tend to be the least-compliant portions of any WSGI stack. As far as I am aware, the CherryPy 3 wsgiserver is the only one currently claiming to be even "conditionally compliant" with HTTP/1.1. WSGI 2 might, in addition, require WSGI origin servers to perform the min(rp, sp) calculation once and pass the result in a new "RESPONSE_PROTOCOL_SUPPORT" environ entry. Note this is not necessarily the same version number as what will be output in the response Status-Line: "An HTTP server SHOULD send a response version equal to the highest version for which the server is at least conditionally compliant, and whose major version is less than or equal to the one received in the request. An HTTP server MUST NOT send a version for which it is not at least conditionally compliant. A server MAY send a 505 (HTTP Version Not Supported) response if [it] cannot send a response using the major version used in the client's request." If a given WSGI application or middleware component is not at least conditionally compliant with HTTP/1.1, the WSGI origin server should downgrade the response version it emits in the Status-Line, but has no standardized way to be informed of this state of affairs. Currently, the burden tends to fall on those who compose WSGI stacks to manually instruct the WSGI origin server to always output HTTP/1.0 if any WSGI component is not conditionally compliant with HTTP/1.1. This issue may need to be addressed in a separate spec covering the composition of WSGI stacks. 
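
The min(rp, sp) calculation described above could be factored out roughly as follows. This is only a sketch under stated assumptions: the helper names `parse_http_version` and `response_protocol_support` are hypothetical and come from neither CherryPy nor any WSGI spec, and returning None on a major-version mismatch is just one way to signal the case where the server would answer 505.

```python
def parse_http_version(protocol):
    """Turn a string such as "HTTP/1.1" into a (major, minor) tuple."""
    name, _, version = protocol.partition("/")
    if name != "HTTP":
        raise ValueError("not an HTTP protocol string: %r" % (protocol,))
    major, _, minor = version.partition(".")
    return int(major), int(minor)


def response_protocol_support(request_protocol, server_protocol):
    """Return the feature level the whole stack may rely on.

    Hypothetical helper: returns None when the major versions differ,
    which is the case where the server would send a 505 instead.
    """
    rp = parse_http_version(request_protocol)
    sp = parse_http_version(server_protocol)
    if rp[0] != sp[0]:
        return None
    # Tuples compare element by element, so min() picks the lower version.
    return "HTTP/%d.%d" % min(rp, sp)
```

An origin server could compute this once per request and store the result under an environ key such as the proposed "RESPONSE_PROTOCOL_SUPPORT", so that middleware and applications need not repeat the comparison.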
Robert Brewer System Architect Amor Ministries fumanchu at amor.org From ianb at colorstudy.com Fri Mar 30 19:42:29 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 30 Mar 2007 12:42:29 -0500 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <5.1.1.6.0.20070330115237.02cb6ee8@sparrow.telecommunity.com> References: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com> <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com> <5.1.1.6.0.20070330115237.02cb6ee8@sparrow.telecommunity.com> Message-ID: <460D4C05.9040404@colorstudy.com> Phillip J. Eby wrote: > I was thinking of something a bit simpler; the environ key would be an > object that, when called, tells the server that it's okay to resume > iteration attempts on the application. A sort of "put me back on the > queue for iteration" call. The callback would have to be safe to call > from any thread at any time, and must not re-enter anything, just > re-enable iteration. OK, that makes sense. So there's something like environ['wsgi.server_resume'] in the environment, and the app yields something that indicates a pause, then calls that value to undo the pause? >>> Ideally, this should be done in a way that's easy for middleware to >>> handle; a flush signal should be handled by the middleware *and* >>> passed up the chain, while any other async signals would be passed >>> directly up the chain (unless it's something like "pause for input" >>> and the middleware controls the input). >>> If we do this right, it should be easier to write middleware that >>> works correctly with respect to buffering, since the issues of >>> flushing and pausing now become explicit rather than implicit. (This >>> should make it easier to teach/learn as well.) >> >> In terms of buffering, I can't think of many cases where it would >> matter. 
Either the middleware passes back the response with no >> changes, or it needs to consume the entire response body (and probably >> headers and maybe status) to do whatever transformation it needs to do. >> >> Things like pauses and async signals would ideally be passed upstream, >> but flushes and content would all be consumed by the middleware. > > I can't think of any condition where middleware would *not* pass all of > these up to its caller. In the case of a "flush", it needs to first > yield any buffered output, but it *must* still yield the flush. Is there any use to this? If you are transforming output, the flush is unlikely to flush anything; all output will be buffered. > For example, if you're doing server push, then the app should yield a > flush prior to each new content boundary. If the middleware is doing > compression or some such, then it needs to restart encoding after each > content boundary, as well as flush the prior encoded output. I suppose server push is the only place where flush really matters, and most output transformations will simply break server push. As long as the async signals are easy to detect (e.g., an integer or tuple) then that's fine. -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org | Write code, do good | http://topp.openplans.org/careers From pje at telecommunity.com Fri Mar 30 20:14:57 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 30 Mar 2007 13:14:57 -0500 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <460D4C05.9040404@colorstudy.com> References: <5.1.1.6.0.20070330115237.02cb6ee8@sparrow.telecommunity.com> <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com> <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com> <5.1.1.6.0.20070330115237.02cb6ee8@sparrow.telecommunity.com> Message-ID: <5.1.1.6.0.20070330125052.042c64d0@sparrow.telecommunity.com> At 12:42 PM 3/30/2007 -0500, Ian Bicking wrote: >Phillip J. 
Eby wrote: >>I was thinking of something a bit simpler; the environ key would be an >>object that, when called, tells the server that it's okay to resume >>iteration attempts on the application. A sort of "put me back on the >>queue for iteration" call. The callback would have to be safe to call >>from any thread at any time, and must not re-enter anything, just >>re-enable iteration. > >OK, that makes sense. So there's something like >environ['wsgi.server_resume'] in the environment, and the app yields >something that indicates a pause, then calls that value to undo the pause? Yep. I guess we should distinguish here between "pause but poll" and "pause and wait for the callback". i.e., the operations might be something like: PAUSE_AND_POLL PAUSE_AND_WAIT FLUSH >>>>Ideally, this should be done in a way that's easy for middleware to >>>>handle; a flush signal should be handled by the middleware *and* passed >>>>up the chain, while any other async signals would be passed directly up >>>>the chain (unless it's something like "pause for input" and the >>>>middleware controls the input). >>>>If we do this right, it should be easier to write middleware that works >>>>correctly with respect to buffering, since the issues of flushing and >>>>pausing now become explicit rather than implicit. (This should make it >>>>easier to teach/learn as well.) >>> >>>In terms of buffering, I can't think of many cases where it would >>>matter. Either the middleware passes back the response with no changes, >>>or it needs to consume the entire response body (and probably headers >>>and maybe status) to do whatever transformation it needs to do. >>> >>>Things like pauses and async signals would ideally be passed upstream, >>>but flushes and content would all be consumed by the middleware. >>I can't think of any condition where middleware would *not* pass all of >>these up to its caller. 
In the case of a "flush", it needs to first
>>yield any buffered output, but it *must* still yield the flush.
>
>Is there any use to this? If you are transforming output, the flush is
>unlikely to flush anything; all output will be buffered.

That depends on whether the transformation is of a streaming nature. If you're talking about things that e.g. apply XSL or some such, those are probably really MFCs rather than true middleware, and it's okay for an MFC to have more constraints on its wrapped application than transparent middleware does.

>>For example, if you're doing server push, then the app should yield a
>>flush prior to each new content boundary. If the middleware is doing
>>compression or some such, then it needs to restart encoding after each
>>content boundary, as well as flush the prior encoded output.
>
>I suppose server push is the only place where flush really matters, and
>most output transformations will simply break server push.

More precisely, they should just not apply their transformations to a multipart content type, unless they know how to handle it.

However, there is another place where flow control matters, and that is streaming files which are too large to practically buffer in memory. Such files need a way to "suggest" that they be split into smaller blocks. Having a requirement that flow control be passed through allows us to ensure that middleware doesn't try to consume the whole response, you see.

In WSGI 1.0, we handle this by treating *every* block as if it were followed by a flush, but in 2.0 I'd like to accommodate the fact that many people seem to think that yielding is like using "print" in CGI.

I'm not married to the specific mechanism we use, but I *would* like to see WSGI 2.0 make it easy for middleware authors to comply in such a way as to handle streaming and push correctly.

Hm. Maybe what we need is a way to specify the *type* of response, so that middleware can ignore what it can't handle...
e.g.:

    def simple_app(environ):
        return resp_type, status, headers, content

Then if the response type is STREAM or ASYNC, the middleware could opt out of it, returning the response as-is.

OTOH, adding an extra return value seems like a pain when so few applications would use it, and so little middleware would care. Maybe it would be better to add something to the start of the status string, instead? E.g. "if status.startswith('!'): return original_response"?

> As long as the async signals are easy to detect (e.g., an integer or
> tuple) then that's fine.
>
>--
>Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
> | Write code, do good | http://topp.openplans.org/careers
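
The rule discussed in this message (streaming middleware yields its buffered output on a flush, then still passes the flush up, and passes every other signal straight through) might look roughly like the sketch below. It assumes the proposed three-tuple return signature and plain integers as flow-control signals. The operation names come from the thread, but their numeric values, the `is_signal` helper, and the toy upper-casing transformation are all hypothetical.

```python
# Hypothetical signal values; the thread names the operations but assigns no numbers.
PAUSE_AND_POLL = 0
PAUSE_AND_WAIT = 1
FLUSH = 2


def is_signal(item):
    """Assumption: flow-control items are integers, body chunks are strings."""
    return isinstance(item, int)


def uppercase_middleware(app):
    """Toy streaming transformation that buffers content between signals."""
    def wrapper(environ):
        status, headers, app_iter = app(environ)

        def transformed():
            buffered = []
            for item in app_iter:
                if not is_signal(item):
                    buffered.append(item.upper())
                    continue
                if item == FLUSH and buffered:
                    # Emit pending output first, then forward the flush itself.
                    yield "".join(buffered)
                    buffered = []
                yield item  # every signal is passed up to the caller
            if buffered:
                yield "".join(buffered)

        return status, headers, transformed()
    return wrapper


def demo_app(environ):
    """Example application using the proposed (status, headers, app_iter) return."""
    def body():
        yield "part one, "
        yield "still part one"
        yield FLUSH
        yield PAUSE_AND_POLL
        yield "part two"
    return "200 OK", [("Content-Type", "text/plain")], body()
```

Because the signals are integers rather than strings, a server or an outer middleware can tell them apart from body data with a single isinstance check, which is the "easy to detect" property asked for above.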