From ianb at colorstudy.com  Sat Mar  3 02:17:33 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 02 Mar 2007 19:17:33 -0600
Subject: [Web-SIG] PasteDeploy comments
In-Reply-To: <DF790971-360D-49B7-A62E-6E42C36C595F@zope.com>
References: <DF790971-360D-49B7-A62E-6E42C36C595F@zope.com>
Message-ID: <45E8CCAD.2050008@colorstudy.com>

Jim Fulton wrote:
> I don't remember if we decided that these would be sent to just you or 
> to the Web SIG.  Since I didn't see any messages go to the Web SIG, I'll 
> assume we're just supposed to send these to you.

I suppose we could take this to Web-SIG.  For those who weren't at the 
PyCon mini-meeting we had, we talked about creating a cross-framework 
application server.  Basically the thing that deals with PID files, 
chuser, parts of connection handling, etc.  I don't think we've written 
up anything yet, but hopefully some people who were taking notes can 
expand.  Or... something.  Anyway, we talked about using Paste Deploy 
entry points for configuration.

> - I think you were a bit uncomfortable about the use of the  
> global_config argument to the factory functions.  I share this  
> discomfort a bit.  It seems a little odd to expose the configuration 
> mechanism this much.  It isn't a big deal for me.
> 
>   What have you used global configuration data for?

It's often meant for configuration that applies to many components.  For 
instance, a "debug" value that applies widely (or could also be applied 
locally).  Or information about where to email errors, some logging 
information, etc.  E.g., you might give a base directory for logging in 
global_conf, and an application could pick that up and probably put it 
in a subdirectory there (where if you configured it locally, you'd 
probably give the application the full path of the log file).
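
As a rough sketch of that logging case: a factory in the paste.app_factory style (global_conf plus keyword arguments for the local section) can fall back to a shared base directory when no local path is given.  The key names "log_dir", "log_file" and "debug", and the "myapp" subdirectory, are assumptions made for illustration, not names Paste Deploy defines.

```python
import os


def make_app(global_conf, log_file=None, **local_conf):
    """Hypothetical app factory: local settings win, global ones are defaults."""
    if log_file is None:
        # Nothing configured locally: derive a per-application path from
        # the shared base directory ("log_dir" is an assumed key name).
        base = global_conf.get("log_dir", "/var/log")
        log_file = os.path.join(base, "myapp", "app.log")

    # "debug" may be set locally or inherited from the global section.
    raw_debug = str(local_conf.get("debug", global_conf.get("debug", "false")))
    debug = raw_debug.lower() in ("1", "true", "yes", "on")

    def app(environ, start_response):
        body = ("log_file=%s debug=%s\n" % (log_file, debug)).encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]

    # Exposed only so the resolved settings are easy to inspect.
    app.log_file = log_file
    app.debug = debug
    return app
```

An application configured locally would pass log_file explicitly and never look at the global base directory.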

> - The semantics of paste.server_factory seem to be a little unclear. In 
> particular, I *assume* that the return value is expected to block when 
> run.  Is this true?  If so, then it makes it hard to have more  than one 
> server.  I know that you aren't fond of the idea of having  multiple 
> servers, but a lot of other folks seem to want it. :) In any case, the 
> semantics of the return value need to be documented.

paste.server_factory should be expanded, in part for what you are 
proposing (starting multiple servers).  Also, it seems like there should 
be a better way to shut it down than killing the entire process.  For 
instance, for performance testing.

For multiple servers, I'd generally rather have servers support multiple 
sockets, though this is a little hard in Paste Deploy (you might have 
to use a set of prefixes for configuring each, if you have configuration 
that is port-specific).  But I don't think there's anything wrong with 
starting multiple servers, if you really are starting truly different 
servers.

This could all be done in the same entry point, with optional methods 
(instead of just __call__ being specified), or a new entry point (which 
might be a bit more explicit).
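
One possible shape for the "optional methods" variant, sketched with the standard library's wsgiref server.  The start() and stop() method names are hypothetical; nothing here is an agreed interface, and only __call__ corresponds to the existing blocking behaviour.

```python
import threading
from wsgiref.simple_server import WSGIRequestHandler, make_server


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # suppress per-request logging on stderr


class Server:
    """Hypothetical object returned by a server factory."""

    def __init__(self, host, port):
        self.host = host
        self.port = int(port)
        self._httpd = None
        self._thread = None

    def _bind(self, app):
        self._httpd = make_server(self.host, self.port, app,
                                  handler_class=QuietHandler)
        # If port 0 was requested, record the port the OS actually chose.
        self.port = self._httpd.server_port

    def __call__(self, app):
        # Existing semantics: serve the application and block.
        self._bind(app)
        self._httpd.serve_forever()

    def start(self, app):
        # Optional method (assumed name): serve in a background thread.
        self._bind(app)
        self._thread = threading.Thread(target=self._httpd.serve_forever,
                                        daemon=True)
        self._thread.start()

    def stop(self):
        # Optional method (assumed name): shut down without killing the process.
        self._httpd.shutdown()
        if self._thread is not None:
            self._thread.join()
        self._httpd.server_close()


def server_factory(global_conf, host="127.0.0.1", port="8080"):
    return Server(host, port)
```

A caller that only knows the current convention keeps calling the returned object; a test runner or a multi-server main program could use start() and stop() when they are present.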

> - If multiple servers are supported, then there will need to be a way to 
> specify which applications are used with which servers.

As long as the connection data is there, you can dispatch later (if you 
want to at all).  For instance, most people want http and https to serve 
the same application.

In paste.urlmap configuration I allow things like (in addition to path 
dispatch):

   domain foo = foo_app
   port 443 = https_app
   domain bar port 8888 = test_app

But you can also easily send everything to the same place, or a group of 
things to the same place.  I find this generally more convenient than 
building dispatch any further down.

Arguably the config syntax could support urlmap more natively.  E.g., 
allow sections like [app:/blog].  This could be turned into urlmap 
construction.  Assuming you don't care about the order in which 
middleware is applied, you could have [filter:/blog] automatically wrap 
that application.  (With multiple middleware on the same location, I 
suppose you'd have to supply some qualifier.)
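
To make the [app:/blog] idea concrete, here is a minimal sketch that turns such sections into a prefix dispatcher.  It is not paste.urlmap and not how Paste Deploy loads applications: the build_dispatcher function and the factories lookup keyed on a "use" option are assumptions made for illustration.

```python
import configparser


def build_dispatcher(config_text, factories):
    """Mount every [app:/path] section and dispatch on the longest prefix.

    `factories` maps the value of each section's "use" option to a callable
    that takes the remaining options as keyword arguments (an assumed
    convention for this sketch).
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)

    mounts = []
    for section in parser.sections():
        kind, _, path = section.partition(":")
        if kind != "app" or not path.startswith("/"):
            continue  # not a path-mounted application section
        options = dict(parser.items(section))
        factory = factories[options.pop("use")]
        mounts.append((path.rstrip("/"), factory(**options)))

    # Longest prefix first, so /blog is tried before /.
    mounts.sort(key=lambda mount: len(mount[0]), reverse=True)

    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in mounts:
            if path == prefix or path.startswith(prefix + "/"):
                # Move the matched prefix from PATH_INFO to SCRIPT_NAME.
                environ = dict(environ)
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                environ["PATH_INFO"] = path[len(prefix):]
                return app(environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found\n"]

    return dispatcher
```

A matching [filter:/blog] section could wrap the mounted application at the point where it is appended to the mount list.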

> Overall, PasteDeploy looks very usable.  I'll probably find other issues 
> when I actually try to use it.  One of my next projects will be to look 
> at how to use it in Zope.  zope.paste is a bit too much of a wedge.

zope.paste, as I remembered it, didn't really seem to allow things like 
instantiating multiple Zope applications.  But I can't remember.  And 
that's not always feasible; Zope 2 is unlikely to really support many 
truly separate instantiated applications, but it could still support the 
basic configuration.

Also note that in practice usually an application presents the entry 
point directly, and the framework provides functions to make 
application-specific entry points easy to write.

> On a related note, I'll probably want to do process configuration in the 
> same file that PasteDeploy uses. This would likely include things 
> like:
> 
> - interrupt-check-interval
> 
> - Log files
> 
> I guess there is nothing to prevent this.  I suspect that I'll also get 
> a lot of resistance to moving this out of zope.conf. :/

Yes, the container configuration.  (Incidentally, what exactly do we 
call this thing we're proposing to make?)

> Have you tried pointing logging.fileConfig at a config file containing 
> PasteDeploy sections?  I assume it would work.

I haven't tried it, but I think Ben Bangert has started work on that, 
using global_conf['__file__'] that way.  A more cohesive logging story 
that included that would be nice.
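
A minimal sketch of that approach, assuming global_conf['__file__'] holds the path of the deployment file.  The configure_logging helper is hypothetical; logging.config.fileConfig is the standard library call, and it needs the [loggers], [handlers] and [formatters] sections to be present in that file.

```python
import configparser
import logging
import logging.config


def configure_logging(global_conf):
    """Hypothetical helper: set up logging from the deployment file itself.

    Returns True if logging was configured, False if the file has no
    [loggers] section (or no file path is known).
    """
    path = global_conf.get("__file__")
    if not path:
        return False

    # Only peek at the section names; values are never interpolated here.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)
    if not parser.has_section("loggers"):
        return False

    # fileConfig reads just the logging sections it needs and leaves
    # sections such as [app:main] alone.
    logging.config.fileConfig(path, disable_existing_loggers=False)
    return True
```

Whether the main program or an application should make this call is a separate question; the sketch only shows that one file can carry both kinds of sections.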

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From chad at zetaweb.com  Sat Mar  3 04:29:27 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Fri, 02 Mar 2007 22:29:27 -0500
Subject: [Web-SIG] more comments on Paste Deploy
Message-ID: <45E8EB97.6090805@zetaweb.com>

All,

Thanks, Jim and Ian, for bringing this discussion online.

I have two hesitations with Paste Deploy:

   1. The configuration syntax is really complex. I'm much more
      comfortable with multiple simpler config files.

   2. I'm not clear on how Paste Deploy's abstractions map to the
      filesystem. What does my website root look like?


With Aspen, I went with a well-defined filesystem layout (a 
Unix-style userland) and multiple configuration files (in etc/), 
each with their own simple syntax.

So if you publish a blog app called SuperBlog, let's say, you 
would mount it in etc/apps.conf, e.g.:

   / 	 myapp:root
   /blog  superblog:main

SuperBlog would configure itself with etc/superblog.conf, a file 
with a simple syntax described in your SuperBlog documentation. 
SuperBlog also has access to Aspen's global config through a 
simple API.

I suggest that a system with multiple simple config files is much 
more scalable than a single complex config file syntax. Imagine 
if all of Unix were configured using a single syntax!


Also, I don't think we should underestimate the importance of the 
file/executable distinction. A standard "file format" for a 
website enables a wider tool ecosystem to evolve: interactive 
shells, debuggers, test runners, skel systems, configuration UIs. 
It also makes any given website easier to comprehend and maintain.


So in short, I give Paste Deploy a -1 as our main configuration 
system. I'd like the first-line config to be much simpler, with 
Paste Deploy available as an optional extra.


chad

From chad at zetaweb.com  Sat Mar  3 14:09:23 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Sat, 03 Mar 2007 08:09:23 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <45E8EB97.6090805@zetaweb.com>
References: <45E8EB97.6090805@zetaweb.com>
Message-ID: <45E97383.9090905@zetaweb.com>

 > A standard "file format" for a website enables a wider tool
 > ecosystem to evolve: interactive shells, debuggers, test
 > runners, skel systems, configuration UIs.

Not to mention existing tools like workingenv, distutils, ...


From jim at zope.com  Sat Mar  3 16:04:51 2007
From: jim at zope.com (Jim Fulton)
Date: Sat, 3 Mar 2007 10:04:51 -0500
Subject: [Web-SIG] PasteDeploy comments
In-Reply-To: <45E8CCAD.2050008@colorstudy.com>
References: <DF790971-360D-49B7-A62E-6E42C36C595F@zope.com>
	<45E8CCAD.2050008@colorstudy.com>
Message-ID: <EB6BBB04-FE68-4B8B-8861-8C4AD99133B1@zope.com>


On Mar 2, 2007, at 8:17 PM, Ian Bicking wrote:

> Jim Fulton wrote:
>>   What have you used global configuration data for?
>
> It's often meant for configuration that applies to many  
> components.  For instance, a "debug" value that applies widely (or  
> could also be applied locally).  Or information about where to  
> email errors, some logging information, etc.  E.g., you might give  
> a base directory for logging in global_conf, and an application  
> could pick that up and probably put it in a subdirectory there  
> (where if you configured it locally, you'd probably give the  
> application the full path of the log file).

I know what it's meant for.  I was asking what it was actually *used*  
for.  Is this truly useful?


>
>> - The semantics of paste.server_factory seem to be a little  
>> unclear. In particular, I *assume* that the return value is  
>> expected to block when run.  Is this true?  If so, then it makes  
>> it hard to have more  than one server.  I know that you aren't  
>> fond of the idea of having  multiple servers, but a lot of other  
>> folks seem to want it. :) In any case, the semantics of the return  
>> value need to be documented.
>
> paste.server_factory should be expanded, in part for what you are  
> proposing (starting multiple servers).

Cool

>   Also, it seems like there should be a better way to shut it down  
> than killing the entire process.  For instance, for performance  
> testing.

<shrug>  This doesn't seem important to me.

...

>> Overall, PasteDeploy looks very usable.  I'll probably find other  
>> issues when I actually try to use it.  One of my next projects will  
>> be to look at how to use it in Zope.  zope.paste is a bit too much  
>> of a wedge.
>
> zope.paste, as I remembered it, didn't really seem to allow things  
> like instantiating multiple Zope applications.  But I can't  
> remember.  And that's not always feasible; Zope 2 is unlikely to  
> really support many truly separate instantiated applications, but  
> it could still support the basic configuration.

zope.paste tries very hard to minimize its impact on zope  
configuration.  It has to make a number of compromises to do this.   
It is impossible to run "truly separate" Python applications in the  
same process, for some definition of "truly separate" and  
"application".  Separate WSGI applications will share common module  
definitions and shared module globals.  I can easily imagine separate  
Zope (2 & 3) applications that exposed separate object spaces or sets  
of procedural (as opposed to object-based) pages.


>> On a related note, I'll probably want to do process configuration  
>> in the same file that PasteDeploy uses. This would likely  
>> include things like:
>> - interrupt-check-interval
>> - Log files
>> I guess there is nothing to prevent this.  I suspect that I'll  
>> also get a lot of resistance to moving this out of zope.conf. :/
>
> Yes, the container configuration.  (Incidentally, what exactly do  
> we call this thing we're proposing to make?)

I'm not sure we're initially proposing to make *a* thing. For  
starters, I think we're exploring using the PasteDeploy-defined  
frameworks and collaborating on server testing.

I would call this the main program, but maybe other terms would be  
better.

>> Have you tried pointing logging.fileConfig at a config file  
>> containing PasteDeploy sections?  I assume it would work.
>
> I haven't tried it, but I think Ben Bangert has started work on  
> that, using global_conf['__file__'] that way.  A more cohesive  
> logging story that included that would be nice.

I think this should be done by the main program (container/whatever)  
not by an application.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jim at zope.com  Sat Mar  3 16:21:28 2007
From: jim at zope.com (Jim Fulton)
Date: Sat, 3 Mar 2007 10:21:28 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <45E8EB97.6090805@zetaweb.com>
References: <45E8EB97.6090805@zetaweb.com>
Message-ID: <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>


I'll respond in a high-level way.

I believe we're evaluating Paste Deploy at 2 levels:

1. Can we agree on a standard set of entry points so that WSGI  
applications can be combined automatically?  I think Paste Deploy  
provides at least a good start on this.

2. Do we want to reuse its configuration syntax?

You haven't commented on the entry points defined by Paste Deploy.   
Do you have an opinion on adopting the entry-point API defined by  
Paste Deploy?

On the subject of configuration format, I suppose this is a matter of  
taste.  I strongly prefer having fewer configuration files,  
preferably one.  One of the things I like about zc.buildout is that  
it lets me gather my configuration in one file.
The configuration format used by Paste Deploy is a simple standard  
format used by many many systems inside and outside the Python  
community.  This makes it easy for people to learn and understand.   
Obviously, we can agree to disagree on this.

I'd very much like, at a minimum, to agree on the entry point API so  
we can more easily collaborate on interoperable applications,  
middleware, and servers.

Jim


On Mar 2, 2007, at 10:29 PM, Chad Whitacre wrote:

> All,
>
> Thanks, Jim and Ian, for bringing this discussion online.
>
> I have two hesitations with Paste Deploy:
>
>    1. The configuration syntax is really complex. I'm much more
>       comfortable with multiple simpler config files.
>
>    2. I'm not clear on how Paste Deploy's abstractions map to the
>       filesystem. What does my website root look like?
>
>
> With Aspen, I went with a well-defined filesystem layout (a
> Unix-style userland) and multiple configuration files (in etc/),
> each with their own simple syntax.
>
> So if you publish a blog app called SuperBlog, let's say, you
> would mount it in etc/apps.conf, e.g.:
>
>    / 	 myapp:root
>    /blog  superblog:main
>
> SuperBlog would configure itself with etc/superblog.conf, a file
> with a simple syntax described in your SuperBlog documentation.
> SuperBlog also has access to Aspen's global config through a
> simple API.
>
> I suggest that a system with multiple simple config files is much
> more scalable than a single complex config file syntax. Imagine
> if all of Unix were configured using a single syntax!
>
>
> Also, I don't think we should underestimate the importance of the
> file/executable distinction. A standard "file format" for a
> website enables a wider tool ecosystem to evolve: interactive
> shells, debuggers, test runners, skel systems, configuration UIs.
> It also makes any given website easier to comprehend and maintain.
>
>
> So in short, I give Paste Deploy a -1 as our main configuration
> system. I'd like the first-line config to be much simpler, with
> Paste Deploy available as an optional extra.
>
>
>
>
> chad

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jim at zope.com  Sat Mar  3 16:42:10 2007
From: jim at zope.com (Jim Fulton)
Date: Sat, 3 Mar 2007 10:42:10 -0500
Subject: [Web-SIG] My summary of a web-platform Open-Space discussion at
	PyCon 2007
Message-ID: <52A740B9-536E-4946-B576-6BA818DF0730@zope.com>


I'll summarize my recollections of a very useful discussion that  
several of us had at PyCon 2007.

At PyCon, Chad Whitacre gathered a number of us for an Open Space  
discussion of how we might collaborate on common infrastructure  
"below WSGI".  As I understood it, this included things like:

- WSGI application assembly

- Main programs

- Process management tools

   - Daemon start, stop, status, etc.

   - Signal handling

   - Log rotation

   - Etc.

I managed to add:

- Server benchmarks

Maybe there were other things in scope that I forgot.

We should have appointed a secretary. :)

I think we decided on some immediate actions:

- Give Ian feedback on Paste Deploy

- Ian will lead a server benchmark effort

In addition, I think there is interest in coming up with best  
practices for daemon and Windows service management.  I don't think  
there were specific action items.  A few tools were mentioned.  (I'll  
send a separate brief note on my ideas about this).

My impression is that there isn't a lot of appetite for standardizing  
on a common pain application.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jim at zope.com  Sat Mar  3 17:08:24 2007
From: jim at zope.com (Jim Fulton)
Date: Sat, 3 Mar 2007 11:08:24 -0500
Subject: [Web-SIG] daemon tools
Message-ID: <515038D2-29A5-498A-848E-8802C1963C91@zope.com>


For some time, Zope has used a daemon-management tool we wrote called  
zdaemon:

   http://www.python.org/pypi/zdaemon

Until late last year, I found this tool a bit difficult to use  
because it was essentially undocumented.  I was forced to learn  
enough to mostly document it and have gained a new appreciation of  
it. (I haven't documented its  interactive shell mode, which I don't  
use. Maybe someone will document it or maybe I'll just rip it out.)   
I considered making some enhancements to it and decided to ask if  
some folks knew about alternative tools we might use instead.  See  
the discussion at:

    http://mail.zope.org/pipermail/zope3-dev/2006-December/021353.html

Ironically, this sort of tool isn't Python specific at all, and the  
discussion highlighted some non-Python tools, notably daemontools  
and runit, neither of which seemed as appealing as zdaemon for  
various reasons.  This discussion also noted a Python-based tool  
named supervisor2:

   http://www.plope.com/software/supervisor2/

which seems to be derived from zdaemon and has some interesting  
features.  I think that both zdaemon and supervisor2 do a better job  
of process management than daemontools or runit.

At the recent open-space discussion, another Python-based tool was  
mentioned whose name I don't remember.

I ended up deciding to use zdaemon for our projects because it met  
our needs very well.  I added a couple of enhancements:

- The ability to set environment variables.  This is really important  
to us as it allows us to set LD_LIBRARY_PATH. This has to be done  
in a supervisor process. A Python program can't set LD_LIBRARY_PATH  
for itself because by then it is too late for it to be used by the  
library loader.

- I finished the transcript log, making it rotatable. The zdaemon  
transcript log consumes the standard error and output of the program  
zdaemon manages, providing basic logging for applications that have  
lacking or lame logging support.  (zdaemon has allowed us to make the  
spread daemon far more manageable.)
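
The environment point can be illustrated with the standard subprocess module: the parent prepares the environment before the child starts, so the dynamic loader sees it from the beginning.  This is only a sketch of the idea, not zdaemon's implementation, and run_with_environment is a hypothetical helper name.

```python
import os
import subprocess
import sys


def run_with_environment(argv, extra_env):
    """Start argv as a child process with extra environment variables set.

    The variables are in place before the child's loader runs, which is
    why this has to happen in the parent (supervisor) process.
    """
    env = dict(os.environ)   # start from the supervisor's own environment
    env.update(extra_env)    # e.g. {"LD_LIBRARY_PATH": "/opt/example/lib"}
    return subprocess.Popen(argv, env=env, stdout=subprocess.PIPE)
```

A supervisor could also point the child's stdout and stderr at a log file here, which is roughly the role the transcript log plays.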

Anyway, I share this for your consideration.  There are probably  
better tools out there than zdaemon and supervisor2, but I'm not  
aware of them. :)  I'm curious what other people have found or use.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From chad at zetaweb.com  Sat Mar  3 17:09:37 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Sat, 03 Mar 2007 11:09:37 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
References: <45E8EB97.6090805@zetaweb.com>
	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
Message-ID: <45E99DC1.4010703@zetaweb.com>

Jim,

Thanks for the reply.


 > 2. Do we want to reuse its configuration syntax.

-1


 > The configuration format used by Paste Deploy is a simple
 > standard format used by many many systems inside and outside
 > the Python community.

I'm not objecting to the general ini-style format (do I read you 
right?), but rather to the overloaded section names, the URI/name 
syntax, the 'set' prefix, composite applications, etc. Paste 
Deploy layers a whole mini-language on top of the ini format.


 > Obviously, we can agree to disagree on this.

Sure, as long as Paste Deploy's config syntax is optional for 
whatever-we're-building. :^)


 > 1. Can we agree on a standard set of entry points so that WSGI
 > applications can be combined automatically?  I think Paste
 > Deploy provides at least a good start on this.
 >
 > You haven't commented on the entry points defined by Paste
 > Deploy.  Do you have an opinion on adopting the entry-point API
 > defined by Paste Deploy?

Ok, I need help: defining an entry point allows a plugin to 
advertise that it can satisfy that entry point, but you still 
need a configuration layer to actually wire it up, no? In which case:

   1) What does "automatically" mean?
   2) Aren't we back to discussing config syntax?


chad


From chad at zetaweb.com  Sat Mar  3 17:12:48 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Sat, 03 Mar 2007 11:12:48 -0500
Subject: [Web-SIG] daemon tools
In-Reply-To: <515038D2-29A5-498A-848E-8802C1963C91@zope.com>
References: <515038D2-29A5-498A-848E-8802C1963C91@zope.com>
Message-ID: <45E99E80.7050800@zetaweb.com>

> Anyway, I share this for your consideration.  There are probably  
> better tools out there than zdaemon and supervisor2, but I'm not  
> aware of them. :)  I'm curious what other people have found or use.

There's also monit:

   http://www.tildeslash.com/monit/


chad

From chad at zetaweb.com  Sat Mar  3 17:18:41 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Sat, 03 Mar 2007 11:18:41 -0500
Subject: [Web-SIG] My summary of a web-platform Open-Space discussion
 at	PyCon 2007
In-Reply-To: <52A740B9-536E-4946-B576-6BA818DF0730@zope.com>
References: <52A740B9-536E-4946-B576-6BA818DF0730@zope.com>
Message-ID: <45E99FE1.1090307@zetaweb.com>

Jim,

 > I'll summarize my recollections of a very useful discussion
 > that several of us had at PyCon 2007.

Looks accurate to me, thanks.


 > - Ian will lead a server benchmark effort

Where by "server," we mean core HTTP server library, yes?


 > My impression is that there isn't a lot of appetite for
 > standardizing on a common pain application.

Sorry, "pain application?" :^)

I assume you mean a common app server executable, as opposed to 
best practice docs, entry point standards, maybe even libraries, 
etc. Yes?


chad

From fumanchu at amor.org  Sat Mar  3 20:05:12 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Sat, 3 Mar 2007 11:05:12 -0800
Subject: [Web-SIG] more comments on Paste Deploy
References: <45E8EB97.6090805@zetaweb.com>
	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
Message-ID: <435DF58A933BA74397B42CDEB8145A86224D4C@ex9.hostedexchange.local>

Jim Fulton wrote:
> I believe we're evaluating Paste Deploy at 2 levels:
> 1. Can we agree on a standard set of entry points so
> that WSGI applications can be combined automatically?
> I think Paste Deploy provides at least a good start on this.

Yes, I think we can. And the ones in paste deploy are a good start (and end, for all I know). But if Ian's going to split Paste Deploy out into its own project (as he hinted), we should find a new namespace for them besides 'paste.*' soon.

> 2. Do we want to reuse its configuration syntax?
> On the subject of configuration format, I suppose this
> is a matter of taste.  I strongly prefer having fewer
> configuration files, preferably one.

In my head, we share a 'site daemon' among us, and a common 'webctl' front end to that daemon should use a single INI-style config file (but like Chad, I'm not sold on Paste's existing format). However, we should build the site daemon in such a way that each framework can drive it in a framework-specific way, and if they wanted to layer their own config style on top of that interface, fine. This would make it easier for the various framework authors and users to explore tutorials, run tests, and deploy single-framework sites.

In short, I'm pushing for:

  read conf -> apply conf -> del conf -> work with objects

as opposed to the much more tightly-coupled and hard-to-use:

  read conf -> work with a mix of conf and objects forever


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From lcrees at gmail.com  Sat Mar  3 20:21:53 2007
From: lcrees at gmail.com (L.C. Rees)
Date: Sat, 3 Mar 2007 12:21:53 -0700
Subject: [Web-SIG] more comments on Paste Deploy
Message-ID: <3ce244090703031121h578bdaa2g8a7a1375b45c60e1@mail.gmail.com>

> Sure, as long as Paste Deploy's config syntax is optional for
> whatever-we're-building. :^)

Some of the pain and angst over choosing one solution to the WSGI
application composition problem could be treated by dividing the
composition process into (at least) three parts:

1. Configuration parsing

Configuration information is read from multiple files or one big file
all at once (something ConfigParser in the standard library, for
example, already has support for) or selectively.  The information,
stored in whatever format (INI, Python, even XML, pick your poison),
is parsed (with optional validation) into a uniform internal Python
format. The internal format would be a sequence of tuples. Each tuple
would contain three elements:

a. An identifier consisting of a tuple that contains two elements, an
(optional) qualifying prefix and a more specific identifier.
b. Configuration parameters that have been parsed into a tuple of
positional arguments.
c. Configuration parameters that have been parsed into a dictionary of
keyword arguments.

2. Dispatching

A dispatcher would take the sequence of tuples from the parser and
resolve the identifier to an adapter. The dispatcher would then strip
out the identifier, and pass a tuple containing the tuple of
positional arguments and the dictionary of keyword arguments to the
adapter.

Different identifier schemes could be accommodated by the same
dispatcher as needed.

3. Adapting

The adapter would be responsible for taking the configuration data in
the tuple passed to it by the dispatcher and returning a configured
WSGI application.
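
The three stages above can be sketched as follows.  The INI section naming ("prefix:name"), the "args" option holding whitespace-separated positional arguments, and the greeting_adapter are all assumptions made for this example; any parser that emits the same tuples would fit.

```python
import configparser


def parse(config_text):
    """Stage 1: INI text -> list of ((prefix, name), args, kwargs) tuples."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    entries = []
    for section in parser.sections():
        prefix, sep, name = section.partition(":")
        if not sep:
            prefix, name = "", section  # the qualifying prefix is optional
        kwargs = dict(parser.items(section))
        # Assumed convention: an "args" option carries positional arguments.
        args = tuple(kwargs.pop("args", "").split())
        entries.append(((prefix, name), args, kwargs))
    return entries


def dispatch(entries, adapters):
    """Stage 2: resolve each identifier to an adapter and drop the identifier."""
    return [adapters[identifier]((args, kwargs))
            for identifier, args, kwargs in entries]


def greeting_adapter(config):
    """Stage 3: turn (args, kwargs) into a configured WSGI application."""
    args, kwargs = config
    words = (kwargs.get("greeting", "hello"),) + args
    body = " ".join(words).encode("utf-8")

    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [body]

    return app
```

Swapping the INI parser for a Python or XML one changes only parse(); the dispatcher and adapters keep working on the same tuples.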

An approach that decomposes the WSGI application composition process
into distinct stages would accommodate different approaches to each
stage of the composition process while allowing interoperability
similar to how WSGI allows heterogeneous Python web applications to
live together in (greater) peace and harmony.

-lcr

From ianb at colorstudy.com  Sat Mar  3 21:39:37 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 03 Mar 2007 14:39:37 -0600
Subject: [Web-SIG] daemon tools
In-Reply-To: <45E99E80.7050800@zetaweb.com>
References: <515038D2-29A5-498A-848E-8802C1963C91@zope.com>
	<45E99E80.7050800@zetaweb.com>
Message-ID: <45E9DD09.8030605@colorstudy.com>

Chad Whitacre wrote:
>> Anyway, I share this for your consideration.  There are probably  
>> better tools out there than zdaemon and supervisor2, but I'm not  
>> aware of them. :)  I'm curious what other people have found or use.
> 
> There's also monit:
> 
>    http://www.tildeslash.com/monit/

I think monit overlaps some with supervisor2's featureset, but not as 
much with zdaemon.  Having monit poll your process to check it's alive 
isn't as solid a solution as having a real parent process to do that. 
Monit would still be useful with zdaemon, because it can poll things 
like HTTP responses, memory usage, etc.


-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From ianb at colorstudy.com  Sat Mar  3 21:40:59 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 03 Mar 2007 14:40:59 -0600
Subject: [Web-SIG] PasteDeploy comments
In-Reply-To: <EB6BBB04-FE68-4B8B-8861-8C4AD99133B1@zope.com>
References: <DF790971-360D-49B7-A62E-6E42C36C595F@zope.com>
	<45E8CCAD.2050008@colorstudy.com>
	<EB6BBB04-FE68-4B8B-8861-8C4AD99133B1@zope.com>
Message-ID: <45E9DD5B.2070900@colorstudy.com>

Jim Fulton wrote:
> 
> On Mar 2, 2007, at 8:17 PM, Ian Bicking wrote:
> 
>> Jim Fulton wrote:
>>>   What have you used global configuration data for?
>>
>> It's often meant for configuration that applies to many components.  
>> For instance, a "debug" value that applies widely (or could also be 
>> applied locally).  Or information about where to email errors, some 
>> logging information, etc.  E.g., you might give a base directory for 
>> logging in global_conf, and an application could pick that up and 
>> probably put it in a subdirectory there (where if you configured it 
>> locally, you'd probably give the application the full path of the log 
>> file).
> 
> I know what it's meant for.  I was asking what it was actually *used* 
> for.  Is this truly useful?

Well, for some things like the debug setting, definitely.  That is, 
*some* applications consume that value, but not all, and in the form of 
global_conf the value just sort of hangs out without being applied to 
anything in particular.  In deployments where I'm using a set of 
applications designed to work together I've found it useful to pass 
values to all of the applications at once.  Also when you pass values in 
through the command-line it gets put into global_conf, because it's not 
clear what section it would otherwise apply to (since the application 
you are intending to affect may be wrapped by middleware).

>>   Also, it seems like there should be a better way to shut it down 
>> than killing the entire process.  For instance, for performance testing.
> 
> <shrug>  This doesn't seem important to me.

Really what I'd like it for is testing, in those times when I really 
want to start up a real HTTP server to test against, then cleanly shut 
it down.

>>> Overall, PasteDeploy looks very usable.  I'll probably find other 
>>> issues when I actually try to use it.  One of my next projects will be 
>>> to look at how to use it in Zope.  zope.paste is a bit too much of a 
>>> wedge.
>>
>> zope.paste, as I remembered it, didn't really seem to allow things 
>> like instantiating multiple Zope applications.  But I can't remember.  
>> And that's not always feasible; Zope 2 is unlikely to really support 
>> many truly separate instantiated applications, but it could still 
>> support the basic configuration.
> 
> zope.paste tries very hard to minimize its impact on zope 
> configuration.  It has to make a number of compromises to do this.  It 
> is impossible to run "truly separate" Python applications in the same 
> process, for some definition of "truly separate" and "application".  
> Separate WSGI applications will share common module definitions and 
> shared module globals.  I can easily imagine separate Zope (2 & 3) 
> applications that exposed separate object spaces or sets of procedural 
> (as opposed to object-based) pages.

"Separate" instances of applications is a fairly vague notion, that only 
means something when applied specifically.  I would hope that you could 
start two Zope apps pointing at different ZODB instances, just like you 
should be able to start two apps pointing to different objects in the 
same ZODB.

>>> On a related note, I'll probably want to do process configuration in 
>>> the same file that PasteDeploy uses. This would likely include 
>>> things like:
>>> - interrupt-check-interval
>>> - Log files
>>> I guess there is nothing to prevent this.  I suspect that I'll also 
>>> get a lot of resistance to moving this out of zope.conf. :/
>>
>> Yes, the container configuration.  (Incidentally, what exactly do we 
>> call this thing we're proposing to make?)
> 
> I'm not sure we're initially proposing to make *a* thing. For starters I 
> think we're exploring using the PasteDeploy-defined frameworks and to 
> collaborate on server testing.
> 
> I would call this the main program, but maybe other terms would be better.
> 
>>> Have you tried pointing logging.fileConfig at a config file containing 
>>> PasteDeploy sections?  I assume it would work.
>>
>> I haven't tried it, but I think Ben Bangert has started work on that, 
>> using global_conf['__file__'] that way.  A more cohesive logging story 
>> that included that would be nice.
> 
> I think this should be done by the main program (container/whatever) not 
> by an application.

In the case of Paste and Pylons, we wanted to add a bunch of logging to 
the library.  The library at that point doesn't belong to any 
application.  Having a bunch of logging without a clear story about how 
to use that logging seemed bad (in this case it's mostly logging 
intended for programmers, not final deployment, but some portions could 
be useful in final deployment).

It could (and probably would) be applied as an outer middleware applied 
by individual applications, but ideally there would be shared 
conventions across frameworks.  Ideally it would also make 
application-specific logging easier.  I think logging configuration is a 
general use case we should consider, but I don't think it's part of the 
container really.  It might relate to something in Paste Deploy 
configuration.
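
The idea quoted above (pointing logging.fileConfig at the file named by 
global_conf['__file__']) can be sketched as follows.  This is only an 
illustration under assumptions: the [app:main] section and egg:SuperBlog 
are placeholders, configure_logging and demo are hypothetical helpers, 
and the logging sections are the ones the standard library's fileConfig 
format requires.

```python
import logging
import logging.config
import os
import tempfile

# One file holding both a deployment section and stdlib logging sections.
EXAMPLE_INI = """
[app:main]
use = egg:SuperBlog

[loggers]
keys = root

[handlers]
keys = console

[formatters]
keys = plain

[logger_root]
level = INFO
handlers = console

[handler_console]
class = StreamHandler
args = (sys.stderr,)
formatter = plain

[formatter_plain]
format = %(levelname)s %(name)s %(message)s
"""


def configure_logging(global_conf):
    """Hand the deployment file itself to the standard logging machinery."""
    logging.config.fileConfig(global_conf["__file__"],
                              disable_existing_loggers=False)


def demo():
    """Write the example file, configure logging from it, report the level."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "deploy.ini")
        with open(path, "w") as f:
            f.write(EXAMPLE_INI)
        configure_logging({"__file__": path})
    return logging.getLogger().getEffectiveLevel()
```

fileConfig only looks at the sections it knows about, so the extra 
[app:main] section is simply ignored.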

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From ianb at colorstudy.com  Sat Mar  3 21:44:57 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 03 Mar 2007 14:44:57 -0600
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
References: <45E8EB97.6090805@zetaweb.com>
	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
Message-ID: <45E9DE49.1010801@colorstudy.com>

Jim Fulton wrote:
> I'll respond in a high-level way.
> 
> I believe, we're evaluating Paste Deploy at 2 levels:
> 
> 1. Can we agree on a standard set of entry points so that WSGI  
> applications can be combined automatically?  I think Paste Deploy  
> provides at least a good start on this.
> 
> 2. Do we want to reuse its configuration syntax?

Yes, I hope people will look at these separately.  The entry points 
provide a consistent way to get at middleware and applications.  I've 
been careful to not expose the actual configuration file to 
applications, and I like that.  It makes it possible to discuss these 
separately.


-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From ianb at colorstudy.com  Sat Mar  3 21:54:41 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 03 Mar 2007 14:54:41 -0600
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <45E8EB97.6090805@zetaweb.com>
References: <45E8EB97.6090805@zetaweb.com>
Message-ID: <45E9E091.3070603@colorstudy.com>

Chad Whitacre wrote:
> All,
> 
> Thanks, Jim and Ian, for bringing this discussion online.
> 
> I have two hesitations with Paste Deploy:
> 
>    1. The configuration syntax is really complex. I'm much more
>       comfortable with multiple simpler config files.

Is it really that complex?  There are a few too many ways to do middleware 
around applications, I'm afraid.  get/set is really a rather obscure 
feature that I seldom use.  The distinction between "composite" and 
"app" isn't necessary, I think.

The ability to inherit from sections is really useful IMHO (though not 
well described in documentation); that's where you do something like 
"use = other_section", and then add settings that override that other 
section's settings.
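
A rough sketch of what that inheritance amounts to, written against the 
standard library's configparser.  It is not Paste Deploy's own loader; 
the resolve function, the section names and egg:SuperBlog are 
assumptions made for illustration.

```python
import configparser

CONFIG = """
[app:base]
use = egg:SuperBlog
debug = false
title = My Blog

[app:dev]
use = base
debug = true
"""


def resolve(parser, name):
    """Flatten a chain of `use = other_section` into one dict of strings."""
    settings = dict(parser["app:" + name])
    parent = settings.get("use", "")
    if parser.has_section("app:" + parent):
        del settings["use"]          # the parent's own "use" takes its place
        merged = resolve(parser, parent)
        merged.update(settings)      # local values override inherited ones
        return merged
    return settings


parser = configparser.ConfigParser()
parser.read_string(CONFIG)
dev_settings = resolve(parser, "dev")
```

Here dev_settings keeps title and the egg reference from [app:base] and 
only replaces debug.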

>    2. I'm not clear on how Paste Deploy's abstractions map to the
>       filesystem. What does my website root look like?
> 
> 
> With Aspen, I went with a well-defined filesystem layout (a 
> Unix-style userland) and multiple configuration files (in etc/), 
> each with their own simple syntax.
> 
> So if you publish a blog app called SuperBlog, let's say, you 
> would mount it in etc/apps.conf, e.g.:
> 
>    / 	 myapp:root
>    /blog  superblog:main
> 
> SuperBlog would configure itself with etc/superblog.conf, a file 
> with a simple syntax described in your SuperBlog documentation. 
> SuperBlog also has access to Aspen's global config through a 
> simple API.

The way I have generally configured websites like this is like:

   [composite:main]
   use = egg:Paste#urlmap
   / = config:root.ini
   /blog = config:superblog.ini

Then I put root.ini and superblog.ini alongside this configuration file, 
and each has an [app:main] section.  (You can also point to another 
section in a file, like config:root.ini#other_section)

> I suggest that a system with multiple simple config files is much 
> more scalable than a single complex config file syntax. Imagine 
> if all of Unix were configured using a single syntax!

I think it depends some on the particular case.  Paste Deploy lets you 
do both.  For instance, in one case we made a really simple application 
that just returned a random bit of HTML selected from a specific file 
full of HTML snippets (used with SSIs).  The basic config looked like:

   [app:random]
   use = egg:Randomizer
   file = /path/to/file.html

Except we had about 5 of these, and we put them all in one file and then 
mounted them like:

   [composite:main]
   use = egg:Paste#urlmap
   /random1 = config:random.ini#random1
   /random2 = config:random.ini#random2 ...

There are other cases where having both options is nice.  Because Paste 
Deploy doesn't fold config files together, you can also reuse them from 
different contexts.  (A more common way to use multiple config files -- 
what ConfigParser.read supports -- is to just overlap all the sections, 
usually totally clobbering each other.  I like this more explicit way of 
bringing in configuration, which treats configuration like a composable 
set of configurations instead of a system where all the configuration 
files are pretty tightly bound to each other.)
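
To make the "overlap" behaviour concrete, here is a small sketch using 
the standard library parser.  The two file bodies and the overlap helper 
are made up for the example (egg:MyApp is a placeholder), and 
read_string is used in place of reading real files.

```python
import configparser

ROOT_INI = """
[app:main]
use = egg:MyApp
file = /path/to/root.html
"""

BLOG_INI = """
[app:main]
use = egg:SuperBlog
"""


def overlap(*sources):
    """Feed several config texts into one parser, later ones winning."""
    parser = configparser.ConfigParser()
    for text in sources:
        parser.read_string(text)   # merges like reading a list of files
    return {name: dict(parser[name]) for name in parser.sections()}


merged = overlap(ROOT_INI, BLOG_INI)
```

The second file silently replaces "use" in [app:main], while "file" from 
the first file lingers on, which is the clobbering described above.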

> Also, I don't think we should underestimate the importance of the 
> file/executable distinction. A standard "file format" for a 
> website enables a wider tool ecosystem to evolve: interactive 
> shells, debuggers, test runners, skel systems, configuration UIs. 
> It also makes any given website easier to comprehend and maintain.

I'm not sure about the distinction you are making here.

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From ianb at colorstudy.com  Sat Mar  3 22:06:07 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 03 Mar 2007 15:06:07 -0600
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <435DF58A933BA74397B42CDEB8145A86224D4C@ex9.hostedexchange.local>
References: <45E8EB97.6090805@zetaweb.com>	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
	<435DF58A933BA74397B42CDEB8145A86224D4C@ex9.hostedexchange.local>
Message-ID: <45E9E33F.7050604@colorstudy.com>

Robert Brewer wrote:
> Jim Fulton wrote:
>  > I believe, we're evaluating Paste Deploy at 2 levels:
>  > 1. Can we agree on a standard set of entry points so
>  > that WSGI applications can be combined automatically?
>  > I think Paste Deploy provides at least a good start on this.
> 
> Yes, I think we can. And the ones in paste deploy are a good start (and 
> end, for all I know). But if Ian's going to split Paste Deploy out into 
> its own project (as he hinted), we should find a new namespace for them 
> besides 'paste.*' soon.

Well, only if we use the entry points ;).  Paste Deploy already supports 
a couple overlapping entry points.  It could support more, or a new 
system could support those plus some more (I assume even if we implement 
a new config file format here, I'll add support to Paste Deploy as well 
for people who don't switch over immediately).

I don't think we should add any new names or prefixes until we've 
solidly settled on what those entry points define.  If we want to rename 
the Paste Deploy entry point groups at that point, that's fine.

>  > 2. Do we want to reuse its configuration syntax?
>  > On the subject of configuration format, I suppose this
>  > is a matter of taste.  I strongly prefer having fewer
>  > configuration files, preferably one.
> 
> In my head, we share a 'site daemon' among us, and a common 'webctl' 
> front end to that daemon should use a single INI-style config file (but 
> like Chad, I'm not sold on Paste's existing format). However, we should 
> build the site daemon in such a way that each framework can drive it in 
> a framework-specific way, and if they wanted to layer their own config 
> style on top of that interface, fine. This would make it easier for the 
> various framework authors and users to explore tutorials, run tests, and 
> deploy single-framework sites.
> 
> In short, I'm pushing for:
> 
>   read conf -> apply conf -> del conf -> work with objects
> 
> as opposed to the much more tightly-coupled and hard-to-use:
> 
>   read conf -> work with a mix of conf and objects forever

I definitely agree that we shouldn't pass big config objects to 
applications (or servers or middleware or whatever).  I don't really 
like that global_conf['__file__'] gives you the filename; it's a little 
vague what it really means when you are nesting several files, and it 
can encourage hacky things.  OTOH, if you want to fold your logging conf 
in with your app conf, it provides a reasonably easy way to do that I 
suppose.  Anyway, besides one or two ways you can poke through, Paste 
Deploy mostly does this.
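
As an illustration of "read conf, apply conf, del conf, work with 
objects", here is a sketch of a factory in the 
app_factory(global_conf, **local_conf) style.  BlogApp, asbool and the 
setting names are assumptions for the example, not an existing 
application.

```python
def asbool(value):
    """Interpret a config string such as "true" or "0" as a boolean."""
    return str(value).strip().lower() in ("1", "true", "yes", "on")


class BlogApp:
    """A WSGI app that holds plain Python values, not a config object."""

    def __init__(self, title, debug):
        self.title = title
        self.debug = debug

    def __call__(self, environ, start_response):
        start_response("200 OK",
                       [("Content-Type", "text/plain; charset=utf-8")])
        return [self.title.encode("utf-8")]


def app_factory(global_conf, **local_conf):
    # Strings are converted once, here; the config dicts are then dropped.
    title = local_conf.get("title", "Untitled")
    debug = asbool(local_conf.get("debug", global_conf.get("debug", "false")))
    return BlogApp(title, debug)
```

After the factory returns, nothing in the application refers back to the 
configuration file or its dictionaries.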

Incidentally, one thing Paste Deploy doesn't really allow well is when 
you have really complicated configuration.  For instance, an application 
like Trac has a big config file with lots of sections.  One could argue 
that it's *too* big, but it is what it is.


-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From ianb at colorstudy.com  Sat Mar  3 22:37:52 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 03 Mar 2007 15:37:52 -0600
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <3ce244090703031121h578bdaa2g8a7a1375b45c60e1@mail.gmail.com>
References: <3ce244090703031121h578bdaa2g8a7a1375b45c60e1@mail.gmail.com>
Message-ID: <45E9EAB0.30201@colorstudy.com>

L.C. Rees wrote:
>> Sure, as long as Paste Deploy's config syntax is optional for
>> whatever-we're-building. :^)
> 
> Some of the pain and angst over choosing one solution to the WSGI
> application composition problem could be treated by dividing the
> composition process into (at least) three parts:
> 
> 1. Configuration parsing
> 
> Configuration information is read from multiple files or one big file
> all at once (something ConfigParser in the standard library, for
> example, already has support for) or selectively.  The information,
> stored in whatever format (INI, Python, even XML, pick your poison),
> is parsed (with optional validation) into a uniform internal Python
> format. 

I don't think we should have any validation in the config format (except 
for basic syntax, of course).  Doing validation is just too hard, and 
leads to a rather complex config framework.  I think some of the 
problems with ZConfig come back to this.

I personally am quite happy with Paste Deploy using straight strings, 
not Python expressions or anything else that presumes to understand values.

> The internal format would be a sequence of tuples. Each tuple
> would contain three elements:
> 
> a. An identifier consisting of a tuple that contains two elements, an
> (optional) qualifying prefix and a more specific identifier.
> b. Configuration parameters that have been parsed into a tuple of
> positional arguments.
> c. Configuration parameters that have been parsed into a dictionary of
> keyword arguments.

I'm confused here.  Can you give an example of what this data would look 
like for something simple?  (E.g., a blog app)  How is this different 
from, or better than, a flat dictionary of strings (which is basically what 
Paste Deploy provides)?

> 2. Dispatching
> 
> A dispatcher would take the sequence of tuples from the parser and
> resolve the identifier to an adapter. The dispatcher would then strip
> out the identifier, and pass a tuple containing the tuple of
> positional arguments and the dictionary of keyword arguments to the
> adapter.
> 
> Different identifier schemes could be accommodated by the same
> dispatcher as needed.

I'm not sure what you are describing here.  Is this like in Paste 
Deploy, we strip out the "use" key to find the entry point?

> 3. Adapting
> 
> The adapter would be responsible for taking the configuration data in
> the tuple passed to it by the dispatcher and returning a configured
> WSGI application.
> 
> An approach that decomposes the WSGI application composition process
> into distinct stages would accommodate different approaches to each
> stage of the composition process while allowing interoperability
> similar to how WSGI allows heterogeneous Python web applications to
> live together in (greater) peace and harmony-lcr

In some ways we can, in some ways we can't.  For instance, a config file 
format that produces integers, lists, etc., is a bit hard to reconcile 
with a separate format that only produces strings.  (If consumers always 
special-case strings this isn't so bad, but if you get used to getting 
non-strings you are less likely to do that.)  Also, is order relevant? 
It isn't in dictionaries, but could be in a file format, but probably 
wouldn't be in a database.  We have to come up with some lowest common 
denominator.  And having done that, we can support *some* set of config 
formats or data sources, but a bunch of formats will seem superfluous, 
as any added value they might provide will be useless since it can't be 
relied upon.

In this sense, while the entry points can be mostly discussed regardless 
of the config format, it's not entirely true -- you have to keep at 
least some set of config formats in your head at the same time as you 
are discussing the entry points.

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From ianb at colorstudy.com  Sat Mar  3 22:48:56 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 03 Mar 2007 15:48:56 -0600
Subject: [Web-SIG] PasteDeploy comments
In-Reply-To: <EB6BBB04-FE68-4B8B-8861-8C4AD99133B1@zope.com>
References: <DF790971-360D-49B7-A62E-6E42C36C595F@zope.com>
	<45E8CCAD.2050008@colorstudy.com>
	<EB6BBB04-FE68-4B8B-8861-8C4AD99133B1@zope.com>
Message-ID: <45E9ED48.3070905@colorstudy.com>

Jim Fulton wrote:
> 
> On Mar 2, 2007, at 8:17 PM, Ian Bicking wrote:
> 
>> Jim Fulton wrote:
>>>   What have you used global configuration data for?
>>
>> It's often meant for configuration that applies to many components.  
>> For instance, a "debug" value that applies widely (or could also be 
>> applied locally).  Or information about where to email errors, some 
>> logging information, etc.  E.g., you might give a base directory for 
>> logging in global_conf, and an application could pick that up and 
>> probably put it in a subdirectory there (where if you configured it 
>> locally, you'd probably give the application the full path of the log 
>> file).
> 
> I know what it's meant for.  I was asking what it was actually *used* 
> for.  Is this truly useful?

An example that would probably apply to Zope: you have several Zope 
apps, but they aren't at the "top" of the website.  That is, there's 
some dispatchers and middleware before you get to them.  If you want 
them all to use some common configuration -- stuff like the location of 
the ZODB -- you might set those values globally, and if the applications 
specifically picked those up (which I would expect) then that would be 
convenient.

Some configuration values don't make any sense to set globally, and 
applications can require local settings (or require that there are no 
extra local settings), so I think the distinction is nice.  I initially 
planned to just fold all the configuration into one set of keywords, but 
Phillip talked me out of it.  It would mean that every application would 
have to take a bunch of keyword arguments they would ignore (since there 
might be global settings that didn't apply to them), and they could 
unintentionally pick up global arguments that only coincidentally 
matched local settings.
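
A sketch of how that separation plays out in a factory, assuming 
hypothetical setting names (log_dir, log_file) and a hypothetical 
resolve_settings function that stands in for a real application 
factory.

```python
import os


def resolve_settings(global_conf, log_file=None, **extra):
    """Accept only known local settings; consult globals for defaults."""
    if extra:
        # Local settings are ours alone, so unknown ones are an error.
        raise TypeError("unexpected local settings: %s"
                        % ", ".join(sorted(extra)))
    if log_file is None:
        # Global settings are shared, so unrelated keys are just ignored.
        log_file = os.path.join(global_conf.get("log_dir", "."), "app.log")
    return {"log_file": log_file}
```

Because global_conf arrives as its own dictionary, a site-wide key such 
as error_email never shows up as a surprise keyword argument, and a 
misspelled local key is still caught.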

Not having *any* global settings would be doable.  You'd have to use a 
lot more of the "get" option that Paste Deploy uses, or maybe if it had 
an option to draw in the settings from another section (e.g., you'd set 
up one zodb section and draw in from it in all your apps).  You'd have 
to know where those settings applied, you wouldn't coincidentally get 
those values, nor would you be as likely to give good site-wide defaults 
where general defaults were acceptable.

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From fumanchu at amor.org  Sun Mar  4 00:19:15 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Sat, 3 Mar 2007 15:19:15 -0800
Subject: [Web-SIG] daemon tools
References: <515038D2-29A5-498A-848E-8802C1963C91@zope.com>
Message-ID: <435DF58A933BA74397B42CDEB8145A86224D4D@ex9.hostedexchange.local>

Jim Fulton wrote:
> For some time, Zope has used a daemon-management tool
> we wrote called zdaemon:
> 
>    http://www.python.org/pypi/zdaemon
> 
> Ironically, this sort of tool isn't Python specific at all,
> and the discussion highlighted some non-Python tools, notably
> daemontools and runit, neither of which seemed as appealing
> as zdaemon for various reasons.

The user interface isn't Python-specific, but the interaction with WSGI servers, middleware, applications, and frameworks should be. Components at all levels of the WSGI stack need to interact with "site-wide" events and settings. What I'm envisioning (and writing for CP at the moment) is a framework-neutral, one-per-site Engine object that is basically a publish/subscribe messenger; when you import a Python web framework, it registers listeners for process start, stop, and graceful restart. These would be things that need to happen regardless of the OS process invoker: whether a common 'webctl' script (that we author), or a framework-specific function (like cherrypy.quickstart), or Apache (via mod_python).

The pub/sub model also supports plugins with their own channel(s). For example, frameworks would blindly call engine.publish('autoreload.add', filename) as desired. If the invoker (webctl, quickstart, or Apache) plugs in an autoreloader, great; it subscribes to that channel, receives each message, and adds each filename to its list of files to monitor. If no autoreloader has been plugged in, the 'add' message is correctly ignored. And when the autoreloader detects a change, it would also publish 'reload' or 'reexec' messages, which would then be subscribed to by a Reexec plugin. Most of the plugins would be provided by the invoker, but frameworks would be free to use the Engine to register their own events and event listeners.

This interface between a site-wide container and the WSGI components is far more important to me than the actual details of invocation (like forking, signal-handling, logging, etc). The latter can be written as Engine plugins, and can compete in a market created by a good "Web Site Engine Interface" spec.
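
A minimal sketch of such a publish/subscribe object, to show how small the interface could be. The Engine and Autoreloader classes here are hypothetical illustrations of the idea, not CherryPy's actual code; only the 'autoreload.add' channel name comes from the description above.

```python
from collections import defaultdict


class Engine:
    """One-per-site messenger: channels map to lists of listeners."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, channel, callback):
        self._listeners[channel].append(callback)

    def publish(self, channel, *args, **kwargs):
        # A channel nobody subscribed to is ignored: the result is empty.
        return [callback(*args, **kwargs)
                for callback in self._listeners.get(channel, ())]


class Autoreloader:
    """Plugin supplied by the invoker; collects filenames to monitor."""

    def __init__(self, engine):
        self.files = set()
        engine.subscribe("autoreload.add", self.files.add)
```

A framework can call engine.publish('autoreload.add', filename) without knowing whether any autoreloader has been plugged in.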


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From reinoutvanrees at gmail.com  Sun Mar  4 00:37:22 2007
From: reinoutvanrees at gmail.com (Reinout van Rees)
Date: Sat, 03 Mar 2007 23:37:22 -0000
Subject: [Web-SIG] python.org mailing list memberships reminder
In-Reply-To: <mailman.12154.1172721883.2611.mailman@python.org>
References: <mailman.12154.1172721883.2611.mailman@python.org>
Message-ID: <1172965042.429886.67700@v33g2000cwv.googlegroups.com>

On Mar 1, 5:04 am, mailman-ow... at python.org wrote:
> This is a reminder, sent out once a month, about your python.org
> mailing list memberships.  It includes your subscription info and how
> to use it to change it or unsubscribe from a list.

Just before someone starts messing around with google groups'
mailinglist subscription: I just switched off the monthly password
reminder by using the password transmitted in that way :-)

Reinout


From chad at zetaweb.com  Sun Mar  4 01:27:30 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Sat, 3 Mar 2007 19:27:30 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <45E9E091.3070603@colorstudy.com>
References: <45E8EB97.6090805@zetaweb.com> <45E9E091.3070603@colorstudy.com>
Message-ID: <f593a5ce0703031627m4d78dfd0te52131f717e0549f@mail.gmail.com>

Ian,

Thanks for weighing in.


> >    2. I'm not clear on how Paste Deploy's abstractions map to the
> >       filesystem. What does my website root look like?
>
> The way I have generally configured websites like this is like:
>
>    [composite:main]
>    use = egg:Paste#urlmap
>    / = config:root.ini
>    /blog = config:superblog.ini

Right, that's the configuration, but where is "egg:Paste#urlmap" on
the filesystem? Are the three ini files alone in some directory? Where
is paste? Where is SuperBlog? Where is the rest of the site? I find it
easier to start with the filesystem and then move up into
object/config abstractions.


> > Also, I don't think we should underestimate the importance of the
> > file/executable distinction. A standard "file format" for a
> > website enables a wider tool ecosystem to evolve: interactive
> > shells, debuggers, test runners, skel systems, configuration UIs.
> > It also makes any given website easier to comprehend and maintain.
>
> I'm not sure about the distinction you are making here.

ODT vs. DOC
ODS vs. XLS
ODP vs. PPT

From ianb at colorstudy.com  Sun Mar  4 01:54:19 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 03 Mar 2007 18:54:19 -0600
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <f593a5ce0703031627m4d78dfd0te52131f717e0549f@mail.gmail.com>
References: <45E8EB97.6090805@zetaweb.com> <45E9E091.3070603@colorstudy.com>
	<f593a5ce0703031627m4d78dfd0te52131f717e0549f@mail.gmail.com>
Message-ID: <45EA18BB.6030703@colorstudy.com>

Chad Whitacre wrote:
>> >    2. I'm not clear on how Paste Deploy's abstractions map to the
>> >       filesystem. What does my website root look like?
>>
>> The way I have generally configured websites like this is like:
>>
>>    [composite:main]
>>    use = egg:Paste#urlmap
>>    / = config:root.ini
>>    /blog = config:superblog.ini
> 
> Right, that's the configuration, but where is "egg:Paste#urlmap" on
> the filesystem? Are the three ini files alone in some directory? Where
> is paste? Where is SuperBlog? Where is the rest of the site? I find it
> easier to start with the filesystem and then move up into
> object/config abstractions.

You just have to understand what egg:Paste#urlmap is, probably from some 
documentation.  Admittedly that's boilerplate in the eyes of most people 
who use it.  It's there explicitly because Paste Deploy doesn't build 
*any* WSGI anything into it, it only composes pieces, one of the most 
common being urlmap.  You can see docs for it with "paster points 
paste.composite_factory urlmap", though I now notice I haven't written 
any docs for it (bad of me), and that is hardly a simple command line. 
I would certainly want to build a command-line help/browser (and 
probably web one too) as part of a rewrite of the system.

The three ini files do go in the same directory, though of course you 
could do config:superblog/app.ini or something like that if you wanted 
to set it up differently.  It's a relative filename, relative to the 
file where it is given.

The applications themselves are eggs.  You install them however you want 
to install them (of course I'd strongly recommend workingenv, 
virtual-python, or zc.buildout, but that's a separate concern).  Some 
people have mentioned some frustration about having to build full 
libraries with a namespace, setup.py, eggs, etc. just to use 
applications.  But I think even pretty modest shops writing very one-off 
apps gain a real benefit from these patterns, once you get over the 
initial hump (and we can build tools to make the initial hump not so 
bad, that's the point of paster create).  Anyway, here's one reply I 
made to that request: 
http://pythonpaste.org/archives/message/20070215.192041.1534ce27.en.html

There are a lot of practices around library management that *have* to be 
done, because people use libraries.  Most of this applies pretty well to 
applications as well -- and since everyone *needs* to learn how to 
manage their libraries, using the same mechanisms for managing 
applications has some advantage.

Incidentally, one change to the config format that would make it 
possible to remove the explicit idea of "composite" apps, is to make 
some key syntax that will instantiate the named object.  E.g.:

   app / = config:root.ini

Then the keywords passed would just be {"/": <actual WSGI app>}, instead 
of the current {"/": "config:root.ini"} (where the "config:root.ini" is 
passed to the loader object that the composite factory gets).
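
Under that proposed syntax a dispatcher would receive already-built WSGI 
apps as keywords.  The following is only a sketch of what such a 
factory could look like; make_prefix_dispatcher is a hypothetical name 
and this is not the code of Paste's urlmap.

```python
def make_prefix_dispatcher(global_conf, **apps):
    """Keywords map path prefixes ("/", "/blog") to WSGI applications."""
    # Try the longest prefix first so "/blog" wins over "/".
    routes = sorted(apps.items(), key=lambda item: len(item[0]),
                    reverse=True)

    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in routes:
            stem = prefix.rstrip("/")
            if path == stem or path.startswith(stem + "/"):
                # Move the matched prefix from PATH_INFO to SCRIPT_NAME.
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + stem
                environ["PATH_INFO"] = path[len(stem):]
                return app(environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found"]

    return dispatcher
```

Since "/" is not a valid Python identifier, such a factory would be 
called as make_prefix_dispatcher({}, **{"/": root_app, "/blog": blog_app}).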

>> > Also, I don't think we should underestimate the importance of the
>> > file/executable distinction. A standard "file format" for a
>> > website enables a wider tool ecosystem to evolve: interactive
>> > shells, debuggers, test runners, skel systems, configuration UIs.
>> > It also makes any given website easier to comprehend and maintain.
>>
>> I'm not sure about the distinction you are making here.
> 
> ODT vs. DOC
> ODS vs. XLS
> ODP vs. PPT

Still unclear.

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From lcrees at gmail.com  Sun Mar  4 02:06:51 2007
From: lcrees at gmail.com (Lynn Rees)
Date: Sat, 03 Mar 2007 18:06:51 -0700
Subject: [Web-SIG] more comments on Paste Deploy
Message-ID: <45EA1BAB.8090801@gmail.com>

 > I don't think we should have any validation in the config format
 > (except for basic syntax, of course).  Doing validation is just too
 > hard, and leads to a rather complex config framework.  I think some
 > of the problems with ZConfig come back to this.

I didn't propose that validation be in the config format. I proposed 
that the configuration parser, whatever config format it's parsing, pass 
the configuration data it extracts on to the next stage of the WSGI 
composition process in a standard format. The parser may or may not 
validate configuration data before it passes it on; decomposing WSGI 
composition into distinct and modular stages means that the rest of the 
composition process doesn't have to care.

 > I personally am quite happy with Paste Deploy using straight strings, 
 > not Python expressions or anything else that presumes to understand
 > values.

I don't disagree but whether to use strings or not is an implementation 
issue (Paste Deploy) and not a process issue (WSGI application 
composition). My proposal addressed the process, not the particular 
implementation.

 > I'm confused here.  Can you give an example of what this data would
 > look like for something simple?  (E.g., a blog app)  How is this
 > different from, or better than, a flat dictionary of strings (which is
 > basically what Paste Deploy provides)?

The message passing format is based on the following premise: 
ultimately, any configuration of a WSGI component involves 1) locating a 
Python routine and 2) passing some combination of arguments and/or 
keywords to it. The format:

(identifier, (args), {kwargs})

contains sufficient information to 1) identify a Python routine and 2) 
pass configuration data to it in a format it's hardwired to handle. 
Whether a collection of configuration directives for a group of WSGI 
components is passed on to the next stage of the composition process as 
a tuple or a dictionary e.g.

{identifier1: ((args), {kwargs}), identifier2: ((args), {kwargs})}

is a matter of complete indifference to me.

 > I'm not sure what you are describing here.  Is this like in Paste
 > Deploy, we strip out the "use" key to find the entry point?

The use of "use" and the concept of a distinct dispatching stage are 
complementary. The dispatcher accesses a map of identifiers to adapters, 
fetches the adapter matching an identifier, and passes configuration 
data to it. The identifier could be the value specified by the "use" 
key. That's an implementation decision.
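
The parse, dispatch and adapt stages described above could be sketched 
roughly like this.  Everything named here (blog_adapter, ADAPTERS, 
dispatch, the ("demo", "blog") identifier) is hypothetical and chosen 
only to make the (identifier, (args), {kwargs}) format concrete.

```python
def blog_adapter(title, debug=False):
    """Adapter: turn configuration data into a configured WSGI app."""
    def app(environ, start_response):
        start_response("200 OK",
                       [("Content-Type", "text/plain; charset=utf-8")])
        return [title.encode("utf-8")]
    return app


# Map of identifiers (prefix, name) to adapters.
ADAPTERS = {("demo", "blog"): blog_adapter}


def dispatch(directives, adapters=ADAPTERS):
    """Resolve each identifier and pass args/kwargs on to its adapter."""
    apps = []
    for identifier, args, kwargs in directives:
        adapter = adapters[identifier]
        apps.append(adapter(*args, **kwargs))
    return apps


# What a parser for any config format might hand to the dispatcher.
directives = [(("demo", "blog"), ("My Blog",), {"debug": True})]
apps = dispatch(directives)
```

The dispatcher never inspects args or kwargs, so what types they carry 
stays between the parser and the adapter.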

 > In some ways we can, in some ways we can't.  For instance, a config
 > file format that produces integers, lists, etc., is a bit hard to
 > reconcile with a separate format that only produces strings.  (If
 > consumers always special-case strings this isn't so bad, but if you
 > get used to getting non-strings you are less likely to do that.)

What's passed in the args tuple and kwargs dictionary is the internal 
business of either the configuration parser that kicks the process off 
or the adapter that receives it at the end. From the point of view of 
and the adapter that receives it at the end. From the point of view of 
passing the data between stages in the composition process, the type of 
the container is the only type that matters. Most Python containers are 
type agnostic and I think that's a good principle to remain faithful to 
in an interop format.

 > Also, is order relevant? It isn't in dictionaries, but could be in a
 > file format, but probably wouldn't be in a database.  We have to come
 > up with some lowest common denominator.  And having done that, we can
 > support *some* set of config formats or data sources, but a bunch of
 > formats will seem superfluous, as any added value they might provide
 > will be useless since it can't be relied upon.

Since I compose WSGI components by sequentially wrapping one component 
within another, a sequence is the most natural way to pass WSGI 
configuration around to me. However, dictionaries are fine with me.

I wouldn't necessarily enforce order in a config format per se. However, 
the point of breaking the composition process into distinct phases is so 
that I can use whatever config file format I wish and know that the WSGI 
component I'm configuring will receive the configuration data-lcr

From ianb at colorstudy.com  Sun Mar  4 02:07:55 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 03 Mar 2007 19:07:55 -0600
Subject: [Web-SIG] My summary of a web-platform Open-Space discussion
 at	PyCon 2007
In-Reply-To: <45E99FE1.1090307@zetaweb.com>
References: <52A740B9-536E-4946-B576-6BA818DF0730@zope.com>
	<45E99FE1.1090307@zetaweb.com>
Message-ID: <45EA1BEB.6000104@colorstudy.com>

Chad Whitacre wrote:
> Jim,
> 
>  > I'll summarize my recollections of a very useful discussion
>  > that several of us had at PyCon 2007.
> 
> Looks accurate to me, thanks.
> 
> 
>  > - Ian will lead a server benchmark effort
> 
> Where by "server," we mean core HTTP server library, yes?

Yes, cherrypy.wsgiserver, paste.httpserver, twisted.web2, flup, etc.

At openplans we (well, Luke) did some performance testing, in our case 
of an intermediary we're writing.  The same basic pattern should fit 
this.  I wrote a couple of WSGI apps for that, which showed particular kinds 
of behavior.  I guess all I really did was an application that 
periodically was really slow.  Another interesting case would be an 
application that yielded content very slowly.  Different combinations of 
app_iter and start_response writer could be interesting.  And of course 
the simplest example (which is usually all people do) of a trivial 
application that just serves up a single short string.  Oh, and I should 
do one that serves up a large string, in one chunk and many chunks.
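
A rough sketch of the kinds of test applications described above, written for present-day Python 3. The function names, defaults, and the injectable `sleep` parameter are illustrative assumptions, not the apps actually used at openplans:

```python
import time


def trivial_app(environ, start_response):
    """Serve a single short string (the usual benchmark case)."""
    body = b"hello\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def make_sometimes_slow_app(every=10, delay=2.0, sleep=time.sleep):
    """Return an app that stalls for `delay` seconds on every `every`-th request."""
    state = {"count": 0}  # not thread-safe; good enough for a sketch

    def app(environ, start_response):
        state["count"] += 1
        if state["count"] % every == 0:
            sleep(delay)
        return trivial_app(environ, start_response)
    return app


def make_slow_yield_app(chunks=5, delay=0.5, sleep=time.sleep):
    """Return an app whose app_iter trickles out its content slowly."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])

        def body():
            for i in range(chunks):
                sleep(delay)
                yield b"chunk %d\n" % i
        return body()
    return app


def make_large_app(size=1_000_000, chunk_size=None):
    """Serve `size` bytes, in one chunk or in pieces of `chunk_size`."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "application/octet-stream"),
                                  ("Content-Length", str(size))])
        if chunk_size is None:
            return [b"x" * size]
        return (b"x" * min(chunk_size, size - offset)
                for offset in range(0, size, chunk_size))
    return app
```

Each factory returns an ordinary WSGI callable, so the same set can be mounted behind any of the servers being compared.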

Personally my own interest is in servers that act well even when the 
apps act poorly, more than the single case of a fast server with a 
perfect and fast application behind it.  The perfect app is easy to 
test, so it'll be in there too of course, but just one of many.

I think most of the work will be in setting up httperf with some scripts 
to invoke it and the other server.  The other stuff can already be glued 
together quite easily by Paste Deploy.  Well, not counting some of the 
servers that are harder to put together, like Apache+flup/fastcgi (or 
another server there), or mod_python generally.  I suppose I'll just 
write up some simple httpd.conf's for these cases, and I guess I can 
fire it off from a script easily enough.  Well, I'll probably look to 
someone else to do mod_python (and mod_wsgi before long), since I'm bad 
at setting those up.  Once Apache+flup is set up, Apache+mod_python would 
probably be easy for someone who knows their way around.


-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From chad at zetaweb.com  Sun Mar  4 05:27:29 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Sat, 3 Mar 2007 23:27:29 -0500
Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more
	comments on Paste Deploy)
Message-ID: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>

> >> >    2. I'm not clear on how Paste Deploy's abstractions map to the
> >> >       filesystem. What does my website root look like?
> >>
> >> The way I have generally configured websites like this is like:
> >>
> >>    [composite:main]
> >>    use = egg:Paste#urlmap
> >>    / = config:root.ini
> >>    /blog = config:superblog.ini
> >
> > Right, that's the configuration, but where is "egg:Paste#urlmap" on
> > the filesystem? Are the three ini files alone in some directory? Where
> > is paste? Where is SuperBlog? Where is the rest of the site? I find it
> > easier to start with the filesystem and then move up into
> > object/config abstractions.
>
> You just have to understand what egg:Paste#urlmap is, probably from some
> documentation.  Admittedly that's boilerplate in the eyes of most people
> who use it.  It's there explicitly because Paste Deploy doesn't build
> *any* WSGI anything into it, it only composes pieces, one of the most
> common being urlmap.  You can see docs for it with "paster points
> paste.composite_factory urlmap", though I now notice I haven't written
> any docs for it (bad of me), and that is hardly a simple command line.
> I would certainly want to build a command-line help/browser (and
> probably web one too) as part of a rewrite of the system.
>
> The three ini files do go in the same directory, though of course you
> could do config:superblog/app.ini or something like that if you wanted
> to set it up differently.  It's a relative filename, relative to the
> file where it is given.
>
> The applications themselves are eggs.  You install them however you want
> to install them (of course I'd strongly recommend workingenv,
> virtual-python, or zc.buildout, but that's a separate concern).  Some
> people have mentioned some frustration about having to build full
> libraries with a namespace, setup.py, eggs, etc. just to use
> applications.  But I think even pretty modest shops writing very one-off
> apps gain a real benefit from these patterns, once you get over the
> initial hump (and we can build tools to make the initial hump not so
> bad, that's the point of paster create).  Anyway, here's one reply I
> made to that request:
> http://pythonpaste.org/archives/message/20070215.192041.1534ce27.en.html
>
> There's a lot of practices around library management that *has* to be
> done, because people use libraries.  Most of this applies pretty well to
> applications as well -- and since everyone *needs* to learn how to
> manage their libraries, using the same mechanisms for managing
> applications has some advantage.
>
> Incidentally, one change to the config format that would make it
> possible to remove the explicit idea of "composite" apps, is to make
> some key syntax that will instantiate the named object.  E.g.,:
>
>    app / = config:root.ini
>
> Then the keywords passed would just be {"/": <actual WSGI app>}, instead
> of the current {"/": "config:root.ini"} (where the "config:root.ini" is
> passed to the loader object that the composite factory gets).

Dude, my eyes are seriously glazing over. I want you to say something
simple, like:

  $ cd /usr/local/www
  $ workingenv.py example.com
  ...
  $ cd example.com
  $ source bin/activate
  (example.com)$ mkdir etc

  Then stick a config file in etc/ and run a simple command to start
your website.

That's the kind of thing I imagine you doing (eh?), and it's also the
thing that Aspen does. The difference is mostly in the config files.

Now, Jim: it looks like Zope still uses a Unix-y userland for
INSTANCE_HOME, yes? So that's Paste, Pylons(?), Aspen, Zope2 and Zope3
all using the same filesystem layout. IINM the filesystem structures
of Django and CP/TurboGears are module-level (Bob?), so they could
easily fit into lib/python.

If we could agree on a really simple first-line config file that
handles basic process configuration--address, user/group, threads,
etc.--and then points to the next layer config--be it zope.conf,
paste.ini, apps.conf, or settings.py--then we'd be pretty far towards
a common app server.

That is to say, I think we are really discussing three increasing
levels of cooperation:

  1) Server benchmarks and inter-op standards (Jim)
  2) Common process management library (Bob)
  3) Common web app server

Without discouraging the first two efforts, I'd like to champion the
third. Here would be my proposal:

First, we define a "website" on the filesystem as a Unix-y userland
with, at minimum, the following:

  etc/<foo>.conf
  lib/python

Second, we adopt a simple ini-style format for <foo>.conf, which
handles low-level process config. This file would then point to a
second, framework-specific configuration layer.
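
To make the idea concrete, here is a minimal sketch of reading such a first-level file with the standard-library configparser. The section names, key names, and values are all hypothetical; the proposal above does not define them:

```python
import configparser

# Hypothetical first-level config; every section and key name is illustrative.
EXAMPLE_CONF = """
[process]
address = 0.0.0.0:8080
user = www
group = www
threads = 10

[next]
config = paste.ini
"""


def load_first_level(text):
    """Parse low-level process settings plus the pointer to the next layer."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    host, _, port = parser.get("process", "address").rpartition(":")
    return {
        "host": host,
        "port": int(port),
        "user": parser.get("process", "user"),
        "group": parser.get("process", "group"),
        "threads": parser.getint("process", "threads"),
        # Handed off, uninterpreted, to the framework-specific layer.
        "next_config": parser.get("next", "config"),
    }
```

The first-level loader only interprets the process settings; the value under `next` is passed along untouched, so it could just as well name zope.conf, apps.conf, or settings.py.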

I suggest that this isn't too far from where we each are now, nor from
where our discussion has already led. It fits long-established
patterns (etc, ini), and doesn't preclude cooperation on benchmarks,
inter-op, or libraries. Furthermore, collaborating here would spread
around what amounts to grunt work, and give Python web deployment a
simple and compelling story, while in no way crippling more advanced
use cases.

Are you guys interested in this proposal? If so, I can write it up in
more detail.


chad

From fumanchu at amor.org  Sun Mar  4 07:32:13 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Sat, 3 Mar 2007 22:32:13 -0800
Subject: [Web-SIG] [Proposal] "website" and first-level conf (was:
	morecomments on Paste Deploy)
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
Message-ID: <435DF58A933BA74397B42CDEB8145A86224D50@ex9.hostedexchange.local>

Chad Whitacre wrote:
> First, we define a "website" on the filesystem as a
> Unix-y userland with, at minimum, the following:
> 
>   etc/<foo>.conf
>   lib/python
> 
> Second, we adopt a simple ini-style format for <foo>.conf,
> which handles low-level process config. This file would
> then point to a second, framework-specific configuration
> layer.

I really don't see why we need a standard scaffolding (folder
arrangement) just to read in a config file. Why can't the
location of the site config file be passed as an argument
to the invocation script? Keep in mind that some platforms
will not allow deployers write access to any folders in which
application code is kept...
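
As a sketch of that alternative, an invocation script could take the config path as a plain positional argument. The script and its argument name are hypothetical:

```python
import argparse


def parse_args(argv=None):
    """Parse the command line of a hypothetical site launcher script."""
    parser = argparse.ArgumentParser(description="Start a website process.")
    parser.add_argument("config", help="path to the site config file")
    return parser.parse_args(argv)
```

With this shape the config file can live anywhere the deployer has write access, independent of where application code is installed.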


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From grahamd at dscpl.com.au  Mon Mar  5 00:28:26 2007
From: grahamd at dscpl.com.au (Graham Dumpleton)
Date: Sun, 4 Mar 2007 18:28:26 -0500
Subject: [Web-SIG] Chunked Tranfer encoding on request content.
Message-ID: <1173050906.11628@dscpl.user.openhosting.com>

The WSGI specification doesn't really say much about chunked transfer encoding
for content sent within the body of a request. The only thing that appears to
apply is the comment:

  WSGI servers must handle any supported inbound "hop-by-hop" headers on their
  own, such as by decoding any inbound Transfer-Encoding, including chunked
  encoding if applicable.

What does this really mean in practice though?

As a means of getting feedback on what is the correct approach I'll go through
how the CherryPy WSGI server handles it. The problem is that the CherryPy
approach raises a few issues which make me wonder whether it is doing it in the
most appropriate way.

In CherryPy, when it sees that the Transfer-Encoding is set to 'chunked' while
parsing the HTTP headers, it will at that point, even before it has called
start_response for the WSGI application, read in all content from the body of
the request.

CherryPy reads in the content like this for two reasons. The first is so that
it can then determine the overall length of the content that was available and
set the CONTENT_LENGTH value in the WSGI environ. The second reason is so that
it can read in any additional HTTP header fields that may occur in the trailer
after the last data chunk and also incorporate them into the WSGI environ.

The first issue with what it does is that it has read in all the content. This denies
a WSGI application the ability to stream content from the body of a request and
process it a bit at a time. If the content is huge, buffering it can also mean that
the application process grows significantly in size.

The second issue, although I am confused on whether the CherryPy WSGI server
actually implements this correctly, is that if the client was expecting to see a
100 continue response, this will need to be sent back to the client before any
content can be read. When chunked transfer encoding is not used, such a 100
continue response would in a good WSGI server only be sent when the WSGI
application called read() on wsgi.input for the first time. Ie., the 100 continue
indicates that the application which is consuming the data is actually ready to
start processing it. What CherryPy WSGI server is doing is circumventing that and
the client could think the final consumer application is ready before it actually is.

Note that I am assuming here that 100 continue is still usable in conjunction
with chunked transfer encoding. In CherryPy WSGI server it only actually sends
the 100 continue after it attempts to try and read content in the presence of a
chunked transfer encoding header. Not sure if this is actually a bug or not.

CherryPy WSGI server also doesn't wait until first read() by WSGI application
before sending back the 100 continue either and instead sends it as soon as the
headers are parsed. This may be fine, but possibly not most optimal as it denies
an application the ability to fail a request and avoid a client sending the
actual content.

Now, to my mind, the preferred approach would be that the content would not
be read up front like this and instead CONTENT_LENGTH would simply be unset
in the WSGI environ.

From prior discussions related to input filtering on the list, a WSGI
application shouldn't really be paying much attention to CONTENT_LENGTH anyway
and should just be using read() to get data until it returns an empty string.
Thus, for chunked data, that it doesn't know the content length up front
shouldn't matter as it should just call read() until there is no more. BTW, it may
not be this simple for something like a proxy, but that is a discussion for another
time.
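
A minimal sketch of an application written that way, in present-day Python 3. It assumes the server signals end of input by returning an empty string from read(), which, as this thread shows, not every server of the time did; `read_body` and `block_size` are illustrative names:

```python
def read_body(environ, block_size=8192):
    """Yield the request body in blocks until read() returns an empty string."""
    stream = environ["wsgi.input"]
    while True:
        block = stream.read(block_size)
        if not block:
            break
        yield block


def app(environ, start_response):
    # CONTENT_LENGTH is never consulted; the stream itself marks the end.
    total = sum(len(block) for block in read_body(environ))
    body = b"received %d bytes\n" % total
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```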

Doing this also means that the 100 continue only gets sent when the application
is ready and there is no need for the content to be buffered up.

That it is the actual application which is consuming the data and not some
intermediary means that an application could implement some mechanism whereby
it reads some data, acts on that and starts sending some data in response. The
client then might send more data based on that response which the application
only then reads, sends more data in response, and so on. Thus an end to end
communication stream can be established where the actual overall content length
of the request could never be established up front.

The only problem with deferring any reading of data to when the application
wants to actually read it, is that if the overall length of content in the request
is bounded, there is no way to get access to the additional headers in the trailer
of the request and have them available in the WSGI environ since processing of
the WSGI environ has already occurred before any data was read.

So, what gives? What should a WSGI server do for chunked transfer encoding on
a request?

I may not totally understand 100 continue and chunked transfer encoding and
am happy to be corrected in my understanding of them, but what CherryPy WSGI
server does doesn't seem right to me at first look.

Graham

From sidnei at enfoldsystems.com  Mon Mar  5 01:55:04 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Sun, 4 Mar 2007 21:55:04 -0300
Subject: [Web-SIG] Chunked Tranfer encoding on request content.
In-Reply-To: <1173050906.11628@dscpl.user.openhosting.com>
References: <1173050906.11628@dscpl.user.openhosting.com>
Message-ID: <a7a2b76b0703041655t2a2d2b2buf592e9fe023af213@mail.gmail.com>

I'm not quite aware of the 100 Continue semantics, but I know that
applications which request Transfer-Encoding: chunked should *not*
expect a Content-Length response header, nor should the WSGI thingie
doing the 'chunking' need to know it in advance.

'chunked' is actually very simple. Simplifying it a lot, it basically
needs to output '%x\r\n%s\r\n' % (len(chunk), chunk) for every chunk
of data except the last which should be '0\r\n\r\n'. The only trick
here is ensuring that no chunk of length '0' is written except the
last.
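
The framing described above can be sketched as a small generator. This uses Python 3 bytes formatting (3.5 or later) and emits no trailer fields; `encode_chunked` is an illustrative name:

```python
def encode_chunked(chunks):
    """Yield `chunks` (an iterable of bytes) in chunked transfer coding."""
    for chunk in chunks:
        if chunk:  # a zero-length chunk would end the body early, so skip it
            yield b"%x\r\n%s\r\n" % (len(chunk), chunk)
    yield b"0\r\n\r\n"  # last-chunk marker, with no trailer fields
```

For example, `b"".join(encode_chunked([b"hello", b"", b"world!"]))` gives `b"5\r\nhello\r\n6\r\nworld!\r\n0\r\n\r\n"`.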

What might be happening is that CherryPy is outputting the whole
response body as a single chunk, and relying on the 'Content-Length'
header, which would be silly, I hope that's not what's happening
though I haven't looked.

-- 
Sidnei da Silva
Enfold Systems                http://enfoldsystems.com
Fax +1 832 201 8856     Office +1 713 942 2377 Ext 214

From grahamd at dscpl.com.au  Mon Mar  5 02:33:38 2007
From: grahamd at dscpl.com.au (Graham Dumpleton)
Date: Sun, 4 Mar 2007 20:33:38 -0500
Subject: [Web-SIG] Chunked Tranfer encoding on request content.
Message-ID: <1173058418.9697@dscpl.user.openhosting.com>

Sidnei da Silva wrote ..
> I'm not quite aware of the 100 Continue semantics, but I know that
> applications which request Transfer-Encoding: chunked should *not*
> expect a Content-Length response header, nor should the WSGI thingie
> doing the 'chunking' need to know it in advance.
> 
> 'chunked' is actually very simple. Simplifying it a lot, it basically
> needs to output '%x\r\n%s\r\n' % (len(chunk), chunk) for every chunk
> of data except the last which should be '0\r\n\r\n'. The only trick
> here is ensuring that no chunk of length '0' is written except the
> last.
> 
> What might be happening is that CherryPy is outputting the whole
> response body as a single chunk, and relying on the 'Content-Length'
> header, which would be silly, I hope that's not what's happening
> though I haven't looked.

I am not talking about the response body. I am talking about the body of
the request. For example, the body of a POST request being sent from
client to server.

Graham

From fumanchu at amor.org  Mon Mar  5 03:02:25 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Sun, 4 Mar 2007 18:02:25 -0800
Subject: [Web-SIG] Chunked Tranfer encoding on request content.
References: <1173050906.11628@dscpl.user.openhosting.com>
Message-ID: <435DF58A933BA74397B42CDEB8145A86224D54@ex9.hostedexchange.local>

Graham Dumpleton wrote:
> In CherryPy, when it sees that the Transfer-Encoding
> is set to 'chunked' while parsing the HTTP headers,
> it will at that point, even before it has called
> start_response for the WSGI application, read in all
> content from the body of the request.
> 
> CherryPy reads in the content like this for two reasons.
> The first is so that it can then determine the overall
> length of the content that was available and set the
> CONTENT_LENGTH value in the WSGI environ.

Right; IIRC the rfile just hangs if you try to read
past Content-Length. Perhaps that can be fixed inside
socket.makefile somewhere?

> The second reason is so that it can read in any
> additional HTTP header fields that may occur in
> the trailer after the last data chunk and also
> incorporate them into the WSGI environ.

Yeah; I didn't see any other way to get Trailers into
the environ. Perhaps that can be added to WSGI 2.0?

I also just haven't had time to write a dechunker
which worked on the fly. Patches welcome ;)

> When chunked transfer encoding is not used, such a
> 100 continue response would in a good WSGI server
> only be sent when the WSGI application called read()
> on wsgi.input for the first time.

Sounds reasonable. Again, patches welcome ;)

> Note that I am assuming here that 100 continue is
> still usable in conjunction with chunked transfer
> encoding. In CherryPy WSGI server it only actually
> sends the 100 continue after it attempts to try
> and read content in the presence of a chunked
> transfer encoding header. Not sure if this is
> actually a bug or not.

It looks like a bug. The Expect header should be
checked before decode_chunked (at least until the
100 response can be moved inside read()).
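
Moving the 100 response inside read() could look roughly like the wrapper below. This is only a sketch: `ContinueOnRead`, `wrap_input`, and the `send_continue` callback are hypothetical names, not part of CherryPy or of WSGI:

```python
class ContinueOnRead:
    """Wrap an input stream; send '100 Continue' just before the first read."""

    def __init__(self, stream, send_continue):
        self._stream = stream
        self._send_continue = send_continue  # hypothetical server callback
        self._sent = False

    def read(self, size=-1):
        if not self._sent:
            self._sent = True
            self._send_continue()
        return self._stream.read(size)


def wrap_input(environ, stream, send_continue):
    """Defer the interim response only when the client asked for it."""
    if environ.get("HTTP_EXPECT", "").lower() == "100-continue":
        return ContinueOnRead(stream, send_continue)
    return stream
```

An application that rejects the request without ever calling read() then never triggers the interim response, so the client can skip sending the body.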

Thanks for catching those!


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From sidnei at enfoldsystems.com  Mon Mar  5 03:13:11 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Sun, 4 Mar 2007 23:13:11 -0300
Subject: [Web-SIG] Chunked Tranfer encoding on request content.
In-Reply-To: <1173058418.9697@dscpl.user.openhosting.com>
References: <1173058418.9697@dscpl.user.openhosting.com>
Message-ID: <a7a2b76b0703041813r57c84831n8ab6290ee952da13@mail.gmail.com>

On 3/4/07, Graham Dumpleton <grahamd at dscpl.com.au> wrote:
> I am not talking about the response body. I am talking about the body of
> the request. For example, the body of a POST request being sent from
> client to server.

Ah, ok. Anyway I don't see why it would need to read the whole body to
do chunked.

-- 
Sidnei da Silva
Enfold Systems                http://enfoldsystems.com
Fax +1 832 201 8856     Office +1 713 942 2377 Ext 214

From grahamd at dscpl.com.au  Mon Mar  5 05:50:43 2007
From: grahamd at dscpl.com.au (Graham Dumpleton)
Date: Sun, 4 Mar 2007 23:50:43 -0500
Subject: [Web-SIG] Chunked Tranfer encoding on request content.
Message-ID: <1173070243.5536@dscpl.user.openhosting.com>

Robert Brewer wrote ..
> Graham Dumpleton wrote:
> > In CherryPy, when it sees that the Transfer-Encoding
> > is set to 'chunked' while parsing the HTTP headers,
> > it will at that point, even before it has called
> > start_response for the WSGI application, read in all
> > content from the body of the request.
> > 
> > CherryPy reads in the content like this for two reasons.
> > The first is so that it can then determine the overall
> > length of the content that was available and set the
> > CONTENT_LENGTH value in the WSGI environ.
> 
> Right; IIRC the rfile just hangs if you try to read
> past Content-Length. Perhaps that can be fixed inside
> socket.makefile somewhere?
> 
> > The second reason is so that it can read in any
> > additional HTTP header fields that may occur in
> > the trailer after the last data chunk and also
> > incorporate them into the WSGI environ.
> 
> Yeah; I didn't see any other way to get Trailers into
> the environ. Perhaps that can be added to WSGI 2.0?

Don't know how you could cater for trailers in WSGI 2.0 without coming up with
some totally new scheme of passing such additional information to the WSGI
application.

The first idea I can think of at present is that, when chunked transfer encoding
is used, the WSGI server sets 'wsgi.trailers' to an empty dictionary which it keeps a
reference to and only populates when it actually encounters the trailers. I.e., it is
only guaranteed to be set when read() finally returns an empty string. Any
middleware would be obliged to pass the reference through and not
copy the dictionary, so that changes made later back at the WSGI server
layer would be available to the application.

Second idea I can think of is a new member function in 'wsgi.input' called
'trailers()' which could be used to access them. Alternatively, 'wsgi.trailers'
could also be a function. Either way, it could return None when not yet known
and dictionary when it is.
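
A rough sketch of that second idea: an input object that dechunks on the fly and exposes a trailers() method. Neither `trailers()` nor the `ChunkedInput` class is part of WSGI; both are assumptions for illustration, and error handling for malformed or truncated input is left out:

```python
class ChunkedInput:
    """Decode a chunked request body on the fly, with trailers() at the end."""

    def __init__(self, rfile):
        self._rfile = rfile    # binary file object positioned at the body
        self._remaining = 0    # bytes still unread in the current chunk
        self._trailers = None  # stays None until the last chunk is seen

    def read(self, size=-1):
        # Returns data from at most one chunk per call, so callers are
        # expected to loop until an empty string comes back.
        if self._trailers is not None:
            return b""
        if self._remaining == 0:
            line = self._rfile.readline()
            # Chunk extensions after ';' are ignored.
            self._remaining = int(line.split(b";", 1)[0].strip(), 16)
            if self._remaining == 0:
                self._trailers = self._read_trailers()
                return b""
        wanted = self._remaining if size < 0 else min(size, self._remaining)
        data = self._rfile.read(wanted)
        self._remaining -= len(data)
        if self._remaining == 0:
            self._rfile.readline()  # consume the CRLF that ends the chunk
        return data

    def _read_trailers(self):
        trailers = {}
        while True:
            line = self._rfile.readline().strip()
            if not line:
                return trailers
            name, _, value = line.partition(b":")
            key = name.strip().decode("latin-1")
            trailers[key] = value.strip().decode("latin-1")

    def trailers(self):
        """Return None while the body is still being read, then a dict."""
        return self._trailers
```

Because the trailers only become known after the final read(), an application would call trailers() after its read loop finishes.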

One problem with this is that in Apache, when the trailers are encountered, the
lower level HTTP filter simply merges them on top of the existing input headers.
You don't want to pass the full set of input headers again, so the
WSGI adapter for Apache would need to remember which headers it sent in environ
to begin with and put into the trailers only what had changed, and thus was
actually in the trailer.

Anyway, it looks for the time being as though, if I am going to support streaming of
chunked data, I will have to state as a limitation that trailers aren't available,
as WSGI doesn't provide a way of getting them.

BTW, I looked around at the various packages trying to provide a WSGI server
and I can't find one besides CherryPy WSGI server that even attempts to support
chunked encoding on input. Makes it hard to use what other people did as a
guide. :-(

Graham

From jim at zope.com  Mon Mar  5 12:28:15 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 5 Mar 2007 06:28:15 -0500
Subject: [Web-SIG] daemon tools
In-Reply-To: <435DF58A933BA74397B42CDEB8145A86224D4D@ex9.hostedexchange.local>
References: <515038D2-29A5-498A-848E-8802C1963C91@zope.com>
	<435DF58A933BA74397B42CDEB8145A86224D4D@ex9.hostedexchange.local>
Message-ID: <82E00AFB-0425-487C-A55B-1BD5DAE6E247@zope.com>


On Mar 3, 2007, at 6:19 PM, Robert Brewer wrote:

> Jim Fulton wrote:
> > For some time, Zope has used a daemon-management tool
> > we wrote called zdaemon:
> >
> >    http://www.python.org/pypi/zdaemon
> >
> > Ironically, this sort of tool isn't Python specific at all,
> > and the discussion highlighted some non-Python tools, notably
> > daemontools and runit, neither of which seemed as appealing
> > as zdaemon for various reasons.
>
> The user interface isn't Python-specific, but the interaction with  
> WSGI servers, middleware, applications, and frameworks should be.
I don't think we are talking about the same thing. See my comment at  
the end of this note.
> Components at all levels of the WSGI stack need to interact with  
> "site-wide" events and settings. What I'm envisioning (and writing  
> for CP at the moment) is a framework-neutral, one-per-site Engine  
> object that is basically a publish/subscribe messenger; when you  
> import a Python web framework, it registers listeners for process  
> start, stop, and graceful restart. These would be things that need  
> to happen regardless of the OS process invoker: whether a common  
> 'webctl' script (that we author), or a framework-specific function  
> (like cherrypy.quickstart), or Apache (via mod_python).
I encourage you to look at the zope event system which already  
supports this use case:

   http://www.python.org/pypi/zope.event
   http://www.python.org/pypi/zope.component#handlers

> The pub/sub model also supports plugins with their own channel(s).  
> For example, frameworks would blindly call engine.publish 
> ('autoreload.add', filename) as desired. If the invoker (webctl,  
> quickstart, or Apache) plugs in an autoreloader, great; it  
> subscribes to that channel, receives each message, and adds each  
> filename to its list of files to monitor. If no autoreloader has  
> been plugged in, the 'add' message is correctly ignored. And when  
> the autoreloader detects a change, it would also publish 'reload'  
> or 'reexec' messages, which would then be subscribed to by a Reexec  
> plugin. Most of the plugins would be provided by the invoker, but  
> frameworks would be free to use the Engine to register their own  
> events and event listeners.
>
> This interface between a site-wide container and the WSGI  
> components is far more important to me than the actual details of  
> invocation (like forking, signal-handling, logging, etc). The  
> latter can be written as Engine plugins, and can compete in a  
> market created by a good "Web Site Engine Interface" spec.
I think your "sitewide container" is the main program that loads
the WSGI components.  This might be Apache, if mod_python is used, or
some Python script/program.  I was discussing a tool that managed the
main program in the latter case. Something that started and restarted
it, provided status information, helped it to run as a proper daemon
and so on.
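
For readers following the thread, the publish/subscribe Engine that Robert describes in the quoted text could be sketched as below. Only the `engine.publish('autoreload.add', filename)` call comes from his message; the `subscribe` method, the class layout, and the `Autoreloader` plugin are assumptions made for illustration:

```python
from collections import defaultdict


class Engine:
    """Minimal publish/subscribe messenger, one per site."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, channel, callback):
        self._listeners[channel].append(callback)

    def publish(self, channel, *args, **kwargs):
        # Messages on a channel with no subscribers are silently dropped.
        return [callback(*args, **kwargs)
                for callback in list(self._listeners.get(channel, ()))]


class Autoreloader:
    """Plugin that collects filenames published on 'autoreload.add'."""

    def __init__(self, engine):
        self.files = set()
        engine.subscribe("autoreload.add", self.files.add)
```

A framework can publish filenames without knowing whether any autoreloader is plugged in; if none is, the message simply goes nowhere.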

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jim at zope.com  Mon Mar  5 12:59:09 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 5 Mar 2007 06:59:09 -0500
Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more
	comments on Paste Deploy)
In-Reply-To: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
Message-ID: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>


On Mar 3, 2007, at 11:27 PM, Chad Whitacre wrote:
...
> Now, Jim: it looks like Zope still uses a Unix-y userland for
> INSTANCE_HOME, yes?

Yes, but I hate it.  At Zope Corporation, we're moving away from it
for a number of reasons.

For development, it adds structure that isn't needed.  A Zope
instance really only needs a few files.  Trying to mimic some
notional unix layout just adds pointless structure.

The traditional complex Zope instance file layout led to the use of
an instance "skeleton" to deal with all of the files, which led, in  
turn, to a copy and hack style of configuration customization that is  
inflexible and encourages cruft.

For production deployments, we (Zope Corporation) install files into  
the *real* Unix tree where site administrators want them.  We'll  
typically have a deployment that includes a number of applications.   
The deployment will create directories in /etc, /var/log, and
/var/run, where the applications in the deployment put their
configuration, log, and run-time files.  They may also put files in  
places like /etc/init.d, and /etc/cron.d.  The point being that this  
looks nothing like a traditional Zope instance installation.

Keeping the number of files used by an application minimal makes it
easier to deal with the different needs of development and deployment
and makes it easier, at least for me, to deal with different  
configurations.

...

>   1) Server benchmarks and inter-op standards (Jim)

Ian said he would lead this.

2) Common framework for WSGI application composition.

>   2) Common process management library (Bob)
>   3) Common web app server

Not sure what this is.

>
> Without discouraging the first two efforts, I'd like to champion the
> third. Here would be my proposal:
>
> First, we define a "website" on the filesystem as a Unix-y userland
> with, at minimum, the following:
>
>   etc/<foo>.conf
>   lib/python

-1 for reasons I've already described

I'll note that I find lib/python especially silly.  Why have a lib
directory that contains a single subdirectory?  We started this a
long long time ago with Zope because that's how Python installed its
own modules on Unix systems at the time. Since then, Python has
switched to lib/pythonV.V.  We don't mimic that for hysterical
reasons.  If someone really wanted to mimic how modules got installed
into modern Unix Python installs, they'd use
lib/pythonV.V/site-packages, which would be the height of absurdity.

In practice, at least for us at Zope Corporation, our process  
instances don't have any Python modules.  We have application  
definitions that contain the modules we use and multiple process  
instances of each application that contain only configuration data.

> Second, we adopt a simple ini-style format for <foo>.conf, which
> handles low-level process config. This file would then point to a
> second, framework-specific configuration layer.

We do something like this now.  It doesn't require any particular
file-system layout.

The devil is in the details.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jim at zope.com  Mon Mar  5 13:05:27 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 5 Mar 2007 07:05:27 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <45E9E091.3070603@colorstudy.com>
References: <45E8EB97.6090805@zetaweb.com> <45E9E091.3070603@colorstudy.com>
Message-ID: <B6E2D9DD-0E84-4EA4-9E58-BFDE15A440D0@zope.com>


On Mar 3, 2007, at 3:54 PM, Ian Bicking wrote:

> Chad Whitacre wrote:
>> All,
>>
>> Thanks, Jim and Ian, for bringing this discussion online.
>>
>> I have two hesitations with Paste Deploy:
>>
>>    1. The configuration syntax is really complex. I'm much more
>>       comfortable with multiple simpler config files.
>
> Is it really that complex?

I don't think so, otoh, you make some good points. :)


> There's a few too many ways to do middleware
> around applications, I'm afraid.

Yes

>   get/set is really a rather obscure
> feature that I seldom use.

I don't remember seeing this in the documentation.

>   The distinction between "composite" and
> "app" isn't necessary, I think.

Agreed

> The ability to inherit from sections is really useful IMHO (though not
> well described in documentation); that's where you do something like
> "use = other_section", and then add settings that override that other
> section's settings.

Yes, but the way it is overloaded with selecting an entry point and  
referring to another configuration file is confusing.  I

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jim at zope.com  Mon Mar  5 13:06:22 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 5 Mar 2007 07:06:22 -0500
Subject: [Web-SIG] My summary of a web-platform Open-Space discussion
	at	PyCon 2007
In-Reply-To: <45E99FE1.1090307@zetaweb.com>
References: <52A740B9-536E-4946-B576-6BA818DF0730@zope.com>
	<45E99FE1.1090307@zetaweb.com>
Message-ID: <1FF57434-BFD2-4892-B724-71D2D616250B@zope.com>


On Mar 3, 2007, at 11:18 AM, Chad Whitacre wrote:

> Jim,
>
> > I'll summarize my recollections of a very useful discussion
> > that several of us had at PyCon 2007.
>
> Looks accurate to me, thanks.
>
>
> > - Ian will lead a server benchmark effort
>
> Where by "server," we mean core HTTP server library, yes?

Yes, WSGI server implementations.


> > My impression is that there isn't a lot of appetite for
> > standardizing on a common pain application.
>
> Sorry, "pain application?" :^)

:)

"main application".

> I assume you mean a common app server executable, as opposed to  
> best practice docs, entry point standards, maybe even libraries,  
> etc. Yes?

Yes.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From sidnei at enfoldsystems.com  Mon Mar  5 15:16:30 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Mon, 5 Mar 2007 11:16:30 -0300
Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more
	comments on Paste Deploy)
In-Reply-To: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
Message-ID: <a7a2b76b0703050616r540e35fdif01cc36362a57801@mail.gmail.com>

On 3/5/07, Jim Fulton <jim at zope.com> wrote:
> For production deployments, we (Zope Corporation) install files into
> the *real* Unix tree where site administrators want them.  We'll
> typically have a deployment that includes a number of applications.
> The deployment will create directories in /etc, /var/log, and /var/
> run, where the applications in the deployment put their
> configuration, log, and run-time files.  They may also put files in
> places like /etc/init.d, and /etc/cron.d.  The point being that this
> looks nothing like a traditional Zope instance installation.

How do you see that mapping to win32? There's no '/etc'. The '/etc/init.d'
equivalent would be the current 'zopeservice.py', and the '/etc/cron.d'
equivalent would be 'scheduled tasks'. I believe '/var/log' could be
replaced by logging to the 'nt event log'; there are lots of tools to
work with that. That still leaves '/etc/' and '/var/run' in the air. I
guess they could just go right into the application directory?

-- 
Sidnei da Silva
Enfold Systems                http://enfoldsystems.com
Fax +1 832 201 8856     Office +1 713 942 2377 Ext 214

From jim at zope.com  Mon Mar  5 16:02:42 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 5 Mar 2007 10:02:42 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <45E99DC1.4010703@zetaweb.com>
References: <45E8EB97.6090805@zetaweb.com>
	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
	<45E99DC1.4010703@zetaweb.com>
Message-ID: <57C175B1-A485-4FEF-908C-7B849F576D5E@zope.com>


On Mar 3, 2007, at 11:09 AM, Chad Whitacre wrote:
...
> > 1. Can we agree on a standard set of entry points so that WSGI
> > applications can be combined automatically?  I think Paste
> > Deploy provides at least good start on this.
> >
> > You haven't commented on the entry points defined by Paste
> > Deploy.  Do you have an opinion on adopting the entry-point API
> > defined by Paste Deploy?
>
> Ok, I need help: defining an entry point allows a plugin to  
> advertise that it can satisfy that entry point, but you still need  
> a configuration layer to actually wire it up, no?

Yes.

> In which case:
>
>   1) What does "automatically" mean?

It means that you don't have to write Python code to connect  
applications, servers, and middleware.

>   2) Aren't we back to discussing config syntax?

No. Entry points can be used by a variety of configuration syntaxes  
and by Python code.

I should note that we can divide this discussion further, if we wish.

Paste Deploy defines APIs and entry points for advertising objects  
that provide those APIs.  The APIs are arguably the most essential  
thing to reuse from Paste Deploy.

Entry points add *a* mechanism to make those objects a bit more  
discoverable.  Arguably, specifying an application via:  
eggname#entrypointname doesn't provide much advantage over simply  
specifying the dotted path to an object in a module.  If there were  
more tools for browsing for and working with eggs, then I think entry  
points would provide greater advantages as they would allow the tools  
to guide someone deciding how to reuse an egg by telling them about  
the components available. Personally, I think that use of entry  
points makes sense in a situation like this.
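
To make the comparison concrete, here is a minimal Python sketch of the
two ways of naming a component: an "egg:Dist#name" reference and a plain
dotted path.  The helper names (parse_egg_spec, resolve_dotted) are made
up for illustration and are not Paste Deploy APIs; the "main" default
for an omitted entry point name is an assumption about Paste Deploy's
behaviour.

```python
import importlib


def parse_egg_spec(spec):
    """Split 'egg:Dist#name' into (distribution, entry_point_name)."""
    prefix = "egg:"
    if not spec.startswith(prefix):
        raise ValueError("not an egg spec: %r" % (spec,))
    dist, _, name = spec[len(prefix):].partition("#")
    # Assumption: an omitted entry point name falls back to "main".
    return dist, name or "main"


def resolve_dotted(path):
    """Import 'package.module.attr' and return the named attribute."""
    module_name, _, attr = path.rpartition(".")
    if not module_name:
        raise ValueError("need at least 'module.attr': %r" % (path,))
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# With setuptools installed, the (dist, name) pair would typically be
# handed to pkg_resources.load_entry_point(dist, "paste.app_factory", name).
# That step is left out so the sketch runs without any eggs installed.
dist, name = parse_egg_spec("egg:Paste#urlmap")
dumps = resolve_dotted("json.dumps")
```

The dotted path needs no metadata at all, which is the "not much
advantage" point; the egg form only pays off once tools can list the
entry points a distribution advertises.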

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jim at zope.com  Mon Mar  5 18:14:55 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 5 Mar 2007 12:14:55 -0500
Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more
	comments on Paste Deploy)
In-Reply-To: <a7a2b76b0703050616r540e35fdif01cc36362a57801@mail.gmail.com>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
	<a7a2b76b0703050616r540e35fdif01cc36362a57801@mail.gmail.com>
Message-ID: <660BDDBF-79DA-43AF-8B22-AB7584230A63@zope.com>


On Mar 5, 2007, at 9:16 AM, Sidnei da Silva wrote:

> On 3/5/07, Jim Fulton <jim at zope.com> wrote:
>> For production deployments, we (Zope Corporation) install files into
>> the *real* Unix tree where site administrators want them.  We'll
>> typically have a deployment that includes a number of applications.
>> The deployment will create directories in /etc, /var/log, and /var/
>> run, where the applications in the deployment put their
>> configuration, log, and run-time files.  They may also put files in
>> places like /etc/init.d, and /etc/cron.d.  The point being that this
>> looks nothing like a traditional Zope instance installation.
>
> How do you see that mapping to win32? There's no '/etc'. The '/etc/init.d'
> equivalent would be the current 'zopeservice.py', and the '/etc/cron.d'
> equivalent would be 'scheduled tasks'. I believe '/var/log' could be
> replaced by logging to the 'nt event log'; there are lots of tools to
> work with that. That still leaves '/etc/' and '/var/run' in the air. I
> guess they could just go right into the application directory?

We don't deploy to win32 and I don't know enough about win32 to  
answer.  I expect though that, like Unix, a production deployment is  
going to look different than a development buildout.  In any case,  
I'm pretty sure that the classic unix-mimicking layout has no  
advantages for win32. :)

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From sidnei at enfoldsystems.com  Mon Mar  5 18:25:06 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Mon, 5 Mar 2007 14:25:06 -0300
Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more
	comments on Paste Deploy)
In-Reply-To: <660BDDBF-79DA-43AF-8B22-AB7584230A63@zope.com>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
	<a7a2b76b0703050616r540e35fdif01cc36362a57801@mail.gmail.com>
	<660BDDBF-79DA-43AF-8B22-AB7584230A63@zope.com>
Message-ID: <a7a2b76b0703050925h7085906ck55705ce75b8780@mail.gmail.com>

On 3/5/07, Jim Fulton <jim at zope.com> wrote:
> We don't deploy to win32 and I don't know enough about win32 to
> answer.  I expect though that, like Unix, a production deployment is
> going to look different than a development buildout.  In any case,
> I'm pretty sure that the classic unix-mimicking layout has no
> advantages for win32. :)

Well, it is something that needs to be considered though. We can't
just close one eye and pretend that win32 does not exist.

-- 
Sidnei da Silva
Enfold Systems                http://enfoldsystems.com
Fax +1 832 201 8856     Office +1 713 942 2377 Ext 214

From jtate at rpath.com  Mon Mar  5 18:54:56 2007
From: jtate at rpath.com (Joseph Tate)
Date: Mon, 5 Mar 2007 12:54:56 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <45E9E091.3070603@colorstudy.com>
References: <45E8EB97.6090805@zetaweb.com> <45E9E091.3070603@colorstudy.com>
Message-ID: <200703051254.57032.jtate@rpath.com>

On Saturday 03 March 2007 15:54:41 Ian Bicking wrote:
> Chad Whitacre wrote:
> > I suggest that a system with multiple simple config files is much
> > more scalable than a single complex config file syntax. Imagine
> > if all of Unix were configured using a single syntax!
>
> There's other cases where having both options is nice.  Because Paste
> Deploy doesn't fold config files together, you can also reuse them from
> different contexts.  (A more common way to use multiple config files --
> what ConfigParser.read supports -- is to just overlap all the sections,
> usually totally clobbering each other.  I like this more explicit way of
> bringing in configuration, which treats configuration like a composable
> set of configurations instead of a system where all the configuration
> files are pretty tightly bound to each other.)

I find that multiple files give you a nice way to override defaults, as long 
as the files are read in a way that's predictable and documentable, and 
ultimately appear as if read from a single file (and possibly displayable via 
some diagnostics link in an application).
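
A minimal sketch of that layering, assuming Python 3's configparser
(the ConfigParser module of the time behaves similarly when reading
several files in turn).  The section and option names are invented for
the example, and dump() stands in for the "diagnostics link" idea.

```python
import configparser

# Shipped defaults, read first.
DEFAULTS = """
[server]
host = 127.0.0.1
port = 8080
"""

# Site-specific file, read second; it overrides only what it names.
SITE_OVERRIDES = """
[server]
port = 80
"""


def load_layers(*texts):
    """Read each source in order; later sources override earlier ones."""
    parser = configparser.ConfigParser()
    for text in texts:
        parser.read_string(text)
    return parser


def dump(parser):
    """Return the merged view, as if it had been read from a single file."""
    return {section: dict(parser.items(section))
            for section in parser.sections()}


merged = dump(load_layers(DEFAULTS, SITE_OVERRIDES))
```

The read order is the whole contract here: as long as it is fixed and
documented, the merged view is predictable.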

-- 
Joseph Tate
Software Engineer
rPath Inc.
http://www.rpath.com/rbuilder/
(919) 851-3984 x2106

From jtate at rpath.com  Mon Mar  5 18:25:10 2007
From: jtate at rpath.com (Joseph Tate)
Date: Mon, 5 Mar 2007 12:25:10 -0500
Subject: [Web-SIG] daemon tools
In-Reply-To: <515038D2-29A5-498A-848E-8802C1963C91@zope.com>
References: <515038D2-29A5-498A-848E-8802C1963C91@zope.com>
Message-ID: <200703051225.10896.jtate@rpath.com>

On Saturday 03 March 2007 11:08:24 Jim Fulton wrote:
>
> Anyway, I share this for your consideration.  There are probably
> better tools out there than zdaemon and supervisor2, but I'm not
> aware of them. :)  I'm curious what other people have found or use.

ll.daemon (http://www.livinglogic.de/Python/daemon/index.html) seems to be a 
straightforward and very simple library for core daemon functionality.

Daemontools isn't very well respected by the SysV style initscript crowd, and 
vice versa.  That's an external non python dependency, and not commonly 
available.  Certainly not available on Windows.

I have written my own daemon base class (Pretty restrictive license 
[reciprocal], but I'm sure I could get it loosened).  
http://hg.rpath.com/raa-1.1?f=9ac380d082f4;file=raa/service/daemon.py I'm not 
married to it though, so would be happy to spin it out and remove the conary 
requirements, or just junk it.


-- 
Joseph Tate
Software Engineer
rPath Inc.
http://www.rpath.com/rbuilder/
(919) 851-3984 x2106

From chad at zetaweb.com  Mon Mar  5 19:14:27 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Mon, 05 Mar 2007 13:14:27 -0500
Subject: [Web-SIG] daemon tools
In-Reply-To: <200703051225.10896.jtate@rpath.com>
References: <515038D2-29A5-498A-848E-8802C1963C91@zope.com>
	<200703051225.10896.jtate@rpath.com>
Message-ID: <45EC5E03.2070304@zetaweb.com>

 > ll.daemon (http://www.livinglogic.de/Python/daemon/index.html)
 > seems to be a straightforward and very simple library for core
 > daemon functionality.

I'm using this in Aspen, and I like it. Worth checking out.


chad

From smulloni at smullyan.org  Mon Mar  5 18:57:40 2007
From: smulloni at smullyan.org (Jacob Smullyan)
Date: Mon, 5 Mar 2007 12:57:40 -0500
Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more
	comments on Paste Deploy)
In-Reply-To: <a7a2b76b0703050925h7085906ck55705ce75b8780@mail.gmail.com>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
	<a7a2b76b0703050616r540e35fdif01cc36362a57801@mail.gmail.com>
	<660BDDBF-79DA-43AF-8B22-AB7584230A63@zope.com>
	<a7a2b76b0703050925h7085906ck55705ce75b8780@mail.gmail.com>
Message-ID: <20070305175740.GA7319@smullyan.org>

On Mon, Mar 05, 2007 at 02:25:06PM -0300, Sidnei da Silva wrote:
> Well, it is something that needs to be considered though. We can't
> just close one eye and pretend that win32 does not exist.

Yes, I prefer to close two eyes!


-- 
Jacob Smullyan

From jtate at rpath.com  Mon Mar  5 19:27:23 2007
From: jtate at rpath.com (Joseph Tate)
Date: Mon, 5 Mar 2007 13:27:23 -0500
Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more
	comments on Paste Deploy)
In-Reply-To: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
Message-ID: <200703051327.23326.jtate@rpath.com>

On Saturday 03 March 2007 23:27:29 Chad Whitacre wrote:
>   3) Common web app server
>
> Without discouraging the first two efforts, I'd like to champion the
> third. Here would be my proposal:
>
> First, we define a "website" on the filesystem as a Unix-y userland
> with, at minimum, the following:
>
>   etc/<foo>.conf
>   lib/python
>
<snip>
> Are you guys interested in this proposal? If so, I can write it up in
> more detail.

No, and here's why.  Most apps are deployed as eggs.  This is a relatively 
high ante to pay for complicated setups, but boilerplate setup.py code solves 
the 80% case well enough.  Using eggs means that the apps could be installed 
in different locations, site-packages, user's own pythonpath, anywhere.  The 
config file or files are going to be what determines what gets loaded, and 
where, much more than os.getcwd().  The configuration can be determined via 
searching well known locations /etc/foo.cfg ~/.foo.cfg ./foo.cfg, etc. or 
passed in on the command line.[1]  References to apps will be to their eggs, 
which will be loaded from the Python path.  Installing eggs to arbitrary file 
system locations, while it can be done, doesn't lend itself to 
super-packaging (rpm, dpkg, installshield, etc).  It also requires more setup 
by the end user/deployer than just running easy_install foo_app, or rpm -i 
foo_app.rpm.
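
The lookup order described above could be sketched roughly as follows.
The function names and the "first existing file wins" rule are
assumptions for illustration; the paths are written for a Unix layout.

```python
import os


def candidate_paths(name, home=None, cwd=None):
    """Well-known locations for <name>.cfg, in search order."""
    home = home if home is not None else os.path.expanduser("~")
    cwd = cwd if cwd is not None else os.getcwd()
    return [
        os.path.join("/etc", name + ".cfg"),
        os.path.join(home, "." + name + ".cfg"),
        os.path.join(cwd, name + ".cfg"),
    ]


def find_config(name, explicit=None, exists=os.path.isfile, **where):
    """Return the config path to use, or None if nothing is found.

    A path passed on the command line (explicit) always wins.
    Assumption: otherwise the first existing candidate is used.
    """
    if explicit:
        return explicit
    for path in candidate_paths(name, **where):
        if exists(path):
            return path
    return None
```

The exists argument is only there so the search can be exercised without
touching the real filesystem.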

Also, user-land servers are not that interesting to me.  They're great for 
development, but production use is where I see the pain.

I'm interested in a common app server platform that focuses on running one or 
more applications from an egg (which could, and perhaps should include its 
own configuration) mounted at different url locations.

[1] This use of current working directory for configuration file loading could 
be used in the specialized aspen case for an exploded egg in an arbitrary 
file location.
 

-- 
Joseph Tate
Software Engineer
rPath Inc.
http://www.rpath.com/rbuilder/
(919) 851-3984 x2106

From fumanchu at amor.org  Mon Mar  5 19:38:51 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Mon, 5 Mar 2007 10:38:51 -0800
Subject: [Web-SIG] daemon tools
In-Reply-To: <82E00AFB-0425-487C-A55B-1BD5DAE6E247@zope.com>
Message-ID: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local>

Jim Fulton wrote:
> For some time, Zope has used a daemon-management tool
> we wrote called zdaemon:
>
>    http://www.python.org/pypi/zdaemon
> 
> Ironically, this sort of tool isn't Python specific at all,
> and the discussion highlighted some non-Python tools, notably
> daemontools and runit, neither of which seemed as appealing
> as zdaemon for various reasons.

and Robert Brewer replied:
> The user interface isn't Python-specific, but the interaction with  
> WSGI servers, middleware, applications, and frameworks should be.

and Jim answered:
> I don't think we are talking about the same thing...
>
> I encourage you to look at the zope event system which already  
> supports this use case:
> 
>    http://www.python.org/pypi/zope.event


Yes, and Django has a similar mechanism which they call "signals":

    http://code.djangoproject.com/wiki/Signals

What several people have asked for is the ability to combine
applications (and WSGI components) from a variety of frameworks into a
single "website". What I'm proposing is that we standardize on a set of
topics/channels/events/signals that are "site-wide" events, like start,
stop, restart and graceful. If we collaborated on a tool to manage
those, we could potentially make the codebases of each project smaller,
not just by removing the event manager, but by collaborating on a set of
standard event handlers, one of which could be a "daemonize me" handler.

What we have now:

    CherryPy              Zope              Django
    --------             ------             -------
      ???                events             signals
       |                    |                  |
    autoreload             ???             autoreload
       |                    |                  |
    engine                zdrun               ???
       |                    |                  |
      ???                 zdctl               ???

What we could have instead:

                      webctl     modpython_gateway
                         |           /
         ------------ pywebd ------------
        /                |               \
    --------          ------           ------
    CherryPy           Zope            Django


...where the "pywebd" module:

 1. Composes the WSGI stack (provides a library to do so at least),
 2. Notifies frameworks of site-wide events (like start, stop, restart
and graceful),
 3. Provides plugins that frameworks can "notify"; for example, adding
files to an autoreload plugin.

> I think your "sitewide container" is the main program that loads
> the WSGI components.  This might be Apache, if mod_python is
> used, or some Python script/program.

Apache itself is not going to be the chunk of code that loads the WSGI
components. In my head, a modpython_gateway module (or something
similar) would ask pywebd to do that.

> I was discussing a tool that managed the main program in the
> latter case. Something that started and restarted it, provided
> status information, helped it to run as a proper daemon and so on.

Sure, something like zdctl? But zdctl doesn't do the actual fork, zdrun
does...so what does "help run as a proper daemon" mean?


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From sidnei at enfoldsystems.com  Mon Mar  5 19:42:28 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Mon, 5 Mar 2007 15:42:28 -0300
Subject: [Web-SIG] The importance of deploying Python-based web apps on
	Windows (was: Re: [Proposal] "website" and first-level conf)
Message-ID: <a7a2b76b0703051042y46eb7d2bx3d4834f7d9866cdd@mail.gmail.com>

On 3/5/07, Jacob Smullyan <smulloni at smullyan.org> wrote:
> On Mon, Mar 05, 2007 at 02:25:06PM -0300, Sidnei da Silva wrote:
> > Well, it is something that needs to be considered though. We can't
> > just close one eye and pretend that win32 does not exist.
>
> Yes, I prefer to close two eyes!

I seriously hope you are kidding.

Unfortunately that's not possible. A lot of people, especially when
evaluating open-source projects, have their first contact with the
software through the Windows platform. To quote some numbers, the
Plone Installer for Windows has roughly 3x more downloads than the
second most downloaded package [1].

Now, I see clearly two options for open-source projects: have a
Windows story and increase your downloads by X%, where X can be a
number between 50-300 *wink*, or not have a Windows story and rely
on the *nix crowd to be the sole consumers of your software.

When you talk to a big organization that is already deploying their
applications on the Windows platform, what story do you want to tell them?
'Oh, and by the way, all your investment in Windows software, you will
have to throw all that away if you want to use our software'. Good
luck with that.

I think that it's pretty important that Python-based web apps have as
good a story on Windows as Python has in other fields (pywin32 comes to
mind), but feel free to disagree.

Sorry for the rant.

[1] http://tinyurl.com/2dfx37

-- 
Sidnei da Silva
Enfold Systems                http://enfoldsystems.com
Fax +1 832 201 8856     Office +1 713 942 2377 Ext 214

From jim at zope.com  Mon Mar  5 21:23:56 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 5 Mar 2007 15:23:56 -0500
Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more
	comments on Paste Deploy)
In-Reply-To: <a7a2b76b0703050925h7085906ck55705ce75b8780@mail.gmail.com>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
	<a7a2b76b0703050616r540e35fdif01cc36362a57801@mail.gmail.com>
	<660BDDBF-79DA-43AF-8B22-AB7584230A63@zope.com>
	<a7a2b76b0703050925h7085906ck55705ce75b8780@mail.gmail.com>
Message-ID: <33811262-2044-4B84-8921-9BC481564213@zope.com>


On Mar 5, 2007, at 12:25 PM, Sidnei da Silva wrote:

> On 3/5/07, Jim Fulton <jim at zope.com> wrote:
>> We don't deploy to win32 and I don't know enough about win32 to
>> answer.  I expect though that, like Unix, a production deployment is
>> going to look different than a development buildout.  In any case,
>> I'm pretty sure that the classic unix-mimicking layout has no
>> advantages for win32. :)
>
> Well, it is something that needs to be considered though. We can't
> just close one eye and pretend that win32 does not exist.

I wasn't suggesting we shouldn't consider it.  I just don't think  
win32 will change my opinion of what I think about a unix-inspired  
instance layout.

Someone should think about windows who actually uses it.  I am not a  
windows server administrator, so I can't suggest how deploying  
applications on windows servers would affect file placement or layout.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From sidnei at enfoldsystems.com  Mon Mar  5 21:48:35 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Mon, 5 Mar 2007 17:48:35 -0300
Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more
	comments on Paste Deploy)
In-Reply-To: <33811262-2044-4B84-8921-9BC481564213@zope.com>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
	<a7a2b76b0703050616r540e35fdif01cc36362a57801@mail.gmail.com>
	<660BDDBF-79DA-43AF-8B22-AB7584230A63@zope.com>
	<a7a2b76b0703050925h7085906ck55705ce75b8780@mail.gmail.com>
	<33811262-2044-4B84-8921-9BC481564213@zope.com>
Message-ID: <a7a2b76b0703051248u2bf9d3dfjcda4733f2517d806@mail.gmail.com>

On 3/5/07, Jim Fulton <jim at zope.com> wrote:
> > On 3/5/07, Jim Fulton <jim at zope.com> wrote:
> >> We don't deploy to win32 and I don't know enough about win32 to
> >> answer.  I expect though that, like Unix, a production deployment is
> >> going to look different than a development buildout.  In any case,
> >> I'm pretty sure that the classic unix-mimicking layout has no
> >> advantages for win32. :)
> >
> > Well, it is something that needs to be considered though. We can't
> > just close one eye and pretend that win32 does not exist.
>
> I wasn't suggesting we shouldn't consider it.  I just don't think
> win32 will change my opinion of what I think about a unix-inspired
> instance layout.
>
> Someone should think about windows who actually uses it.  I am not a
> windows server administrator, so I can't suggest how deploying
> applications on windows servers would affect file placement or layout.

Thanks for the clarification.

So can I suggest that when the tools for deploying are created, that
they be extensible so that someone can come in after the fact and put
the win32-specific code in place without having to rewrite everything
from scratch?

Things that come to my mind are:

  - logging (should be able to swap file-based logging for nt event log
logging, for example). With ZConfig/zope.conf this is an easy task.
  - 'cron'-like things, should be able to read settings from a file
and install scheduled tasks that run the same scripts on Windows
  - 'service' code, should be able to have a generic service wrapper
that can run anything as a service.
  - Application shouldn't rely on *nix signals, or should be made
extensible to handle Windows 'named events', which are equivalent but
not quite the same.

-- 
Sidnei da Silva
Enfold Systems                http://enfoldsystems.com
Fax +1 832 201 8856     Office +1 713 942 2377 Ext 214

From ianb at colorstudy.com  Mon Mar  5 22:19:14 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 05 Mar 2007 15:19:14 -0600
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <200703051254.57032.jtate@rpath.com>
References: <45E8EB97.6090805@zetaweb.com> <45E9E091.3070603@colorstudy.com>
	<200703051254.57032.jtate@rpath.com>
Message-ID: <45EC8952.1040703@colorstudy.com>

Joseph Tate wrote:
> On Saturday 03 March 2007 15:54:41 Ian Bicking wrote:
>> Chad Whitacre wrote:
>>> I suggest that a system with multiple simple config files is much
>>> more scalable than a single complex config file syntax. Imagine
>>> if all of Unix were configured using a single syntax!
>> There's other cases where having both options is nice.  Because Paste
>> Deploy doesn't fold config files together, you can also reuse them from
>> different contexts.  (A more common way to use multiple config files --
>> what ConfigParser.read supports -- is to just overlap all the sections,
>> usually totally clobbering each other.  I like this more explicit way of
>> bringing in configuration, which treats configuration like a composable
>> set of configurations instead of a system where all the configuration
>> files are pretty tightly bound to each other.)
> 
> I find that multiple files give you a nice way to override defaults, as long 
> as the files are read in a way that's predictable and documentable, and 
> ultimately appear as if read from a single file (and possibly displayable via 
> some diagnostics link in an application).

Allowing this sort of thing means that the application carries around a 
complete config object of some sort, which I rather dislike -- it allows 
for smart applications, but it makes it much harder to understand the 
configuration and any possible side effects.  If we resolve the 
configuration down to something more limited (as the Paste Deploy entry 
points do) you can't really reconstruct the config from there. 
*Something* could still reconstruct the config (an alternate config 
loader, via logs, via debug settings, etc), just not the application itself.

This is somewhat problematic for applications that have particularly 
complex config requirements, or want to support self-configuration.  The 
best solution that I can think of with Paste Deploy in that case is to 
just use the Paste Deploy configuration to point to the "real" 
configuration.
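
As a sketch of that last idea, here is an application factory with the
(global_conf, **local_conf) signature that paste.app_factory entry
points use, where the Paste Deploy section carries only a pointer to
the application's own config file.  The "real_config" key, the [app]
section and the "greeting" option are invented for the example.

```python
import configparser
import os
import tempfile


def app_factory(global_conf, **local_conf):
    """Build a WSGI app whose detailed settings live in another file."""
    # "real_config" is a hypothetical key set in the Paste Deploy section.
    real = configparser.ConfigParser()
    real.read(local_conf["real_config"])
    greeting = real.get("app", "greeting", fallback="hello")

    def app(environ, start_response):
        body = greeting.encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]

    return app


# Demo: write a throwaway "real" config and build the app from it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "app.cfg")
    with open(path, "w") as f:
        f.write("[app]\ngreeting = hi from the real config\n")
    demo_app = app_factory({}, real_config=path)

demo_body = b"".join(demo_app({}, lambda status, headers: None))
```

The application keeps full control of its own file, and the deployment
layer still sees nothing richer than a handful of keyword arguments.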

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From ianb at colorstudy.com  Mon Mar  5 22:23:46 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 05 Mar 2007 15:23:46 -0600
Subject: [Web-SIG] [Proposal] "website" and first-level conf (was: more
 comments on Paste Deploy)
In-Reply-To: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
Message-ID: <45EC8A62.8060805@colorstudy.com>

Chad Whitacre wrote:
>> >> >    2. I'm not clear on how Paste Deploy's abstractions map to the
>> >> >       filesystem. What does my website root look like?
>> >>
>> >> The way I have generally configured websites like this is like:
>> >>
>> >>    [composite:main]
>> >>    use = egg:Paste#urlmap
>> >>    / = config:root.ini
>> >>    /blog = config:superblog.ini
>> >
>> > Right, that's the configuration, but where is "egg:Paste#urlmap" on
>> > the filesystem? Are the three ini files alone in some directory? Where
>> > is paste? Where is SuperBlog? Where is the rest of the site? I find it
>> > easier to start with the filesystem and then move up into
>> > object/config abstractions.
>>
>> You just have to understand what egg:Paste#urlmap is, probably from some
>> documentation.  Admittedly that's boilerplate in the eyes of most people
>> who use it.  It's there explicitly because Paste Deploy doesn't build
>> *any* WSGI anything into it, it only composes pieces, one of the most
>> common being urlmap.  You can see docs for it with "paster points
>> paste.composite_factory urlmap", though I now notice I haven't written
>> any docs for it (bad of me), and that is hardly a simple command line.
>> I would certainly want to build a command-line help/browser (and
>> probably web one too) as part of a rewrite of the system.
>>
>> The three ini files do go in the same directory, though of course you
>> could do config:superblog/app.ini or something like that if you wanted
>> to set it up differently.  It's a relative filename, relative to the
>> file where it is given.
>>
>> The applications themselves are eggs.  You install them however you want
>> to install them (of course I'd strongly recommend workingenv,
>> virtual-python, or zc.buildout, but that's a separate concern).  Some
>> people have mentioned some frustration about having to build full
>> libraries with a namespace, setup.py, eggs, etc. just to use
>> applications.  But I think even pretty modest shops writing very one-off
>> apps gain a real benefit from these patterns, once you get over the
>> initial hump (and we can build tools to make the initial hump not so
>> bad, that's the point of paster create).  Anyway, here's one reply I
>> made to that request:
>> http://pythonpaste.org/archives/message/20070215.192041.1534ce27.en.html
>>
>> There's a lot of practices around library management that *has* to be
>> done, because people use libraries.  Most of this applies pretty well to
>> applications as well -- and since everyone *needs* to learn how to
>> manage their libraries, using the same mechanisms for managing
>> applications has some advantage.
>>
>> Incidentally, one change to the config format that would make it
>> possible to remove the explicit idea of "composite" apps, is to make
>> some key syntax that will instantiate the named object.  E.g.,:
>>
>>    app / = config:root.ini
>>
>> Then the keywords passed would just be {"/": <actual WSGI app>}, instead
>> of the current {"/": "config:root.ini"} (where the "config:root.ini" is
>> passed to the loader object that the composite factory gets).
> 
> Dude, my eyes are seriously glazing over. I want you to say something
> simple, like:
> 
>  $ cd /usr/local/www
>  $ workingenv.py example.com
>  ...
>  $ cd example.com
>  $ source bin/activate
>  (example.com)$ mkdir etc
> 
>  Then stick a config file in etc/ and run a simple command to start
> your website.

But you are just hand-waving over the exact part that I am describing 
("stick a config file in etc/").  What does that config file look like? 
  How do you handle different cases with it?  I cover a lot of pretty 
normal use cases up there.
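
To show what the composite in that example boils down to, here is a
small prefix dispatcher over a {"/": <WSGI app>} mapping.  It is a
simplified stand-in written for this message, not the code behind
egg:Paste#urlmap, and make_app/url_map/call are made-up names.

```python
def make_app(text):
    """Trivial WSGI app that always answers with the given text."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [text.encode("utf-8")]
    return app


def url_map(mapping):
    """Dispatch to the app mounted at the longest matching path prefix."""
    prefixes = sorted(mapping, key=len, reverse=True)

    def dispatch(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix in prefixes:
            base = prefix.rstrip("/")
            if path == base or path.startswith(base + "/"):
                # Shift the matched prefix from PATH_INFO to SCRIPT_NAME.
                environ = dict(environ)
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + base
                environ["PATH_INFO"] = path[len(base):]
                return mapping[prefix](environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]

    return dispatch


def call(app, path):
    """Helper for trying an app out without a server."""
    return b"".join(app({"PATH_INFO": path}, lambda status, headers: None))


site = url_map({"/": make_app("root"), "/blog": make_app("blog")})
```

In the config file the two values would be "config:root.ini" and
"config:superblog.ini"; the loader's job is to turn each of those into
an app object like the ones passed to url_map() here.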

> That's the kind of thing I imagine you doing (eh?), and it's also the
> thing that Aspen does. The difference is mostly in the config files.
> 
> Now, Jim: it looks like Zope still uses a Unix-y userland for
> INSTANCE_HOME, yes? So that's Paste, Pylons(?), Aspen, Zope2 and Zope3
> all using the same filesystem layout. IINM the filesystem structures
> of Django and CP/TurboGears are module-level (Bob?), so they could
> easily fit into lib/python.
> 
> If we could agree on a really simple first-line config file that
> handles basic process configuration--address, user/group, threads,
> etc.--and then points to the next layer config--be it zope.conf,
> paste.ini, apps.conf, or settings.py--then we'd be pretty far towards
> a common app server.

Part of why I push Paste Deploy is because every simpler or more 
abstract config idea could just as easily be composed as a Paste Deploy 
entry point.  That is, one can create the abstract idea of a config 
loader, but that requires all the same boilerplate that a minimal Paste 
Deploy config file has anyway.  Which is not to say someone might not 
want to write a different loader, but I don't think adding another layer 
of abstraction that's more neutral helps.

> That is to say, I think we are really discussing three increasing
> levels of cooperation:
> 
>  1) Server benchmarks and inter-op standards (Jim)
>  2) Common process management library (Bob)
>  3) Common web app server
> 
> Without discouraging the first two efforts, I'd like to champion the
> third. Here would be my proposal:
> 
> First, we define a "website" on the filesystem as a Unix-y userland
> with, at minimum, the following:
> 
>  etc/<foo>.conf
>  lib/python
> 
> Second, we adopt a simple ini-style format for <foo>.conf, which
> handles low-level process config. This file would then point to a
> second, framework-specific configuration layer.

If it's framework-specific, how do you determine what the framework is? 
You need some kind of slug to do that, or else a separate runner. 
That also doesn't really do anything for composing multiple different 
applications that happen to use different frameworks.  Personally I find 
framework-specific configuration rather dumb, because the point of all 
this isn't to build *frameworks*, it's to build *applications*, and 
frameworks are just an implementation detail of an application.

One could say that it would be better if the application shipped its own 
setup, meaning its own appctl script.  This doesn't allow very well for 
wrapping or composing applications, but it's a valid thing to provide. 
But I don't think your proposal goes in that direction.

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From pje at telecommunity.com  Mon Mar  5 22:38:51 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 05 Mar 2007 16:38:51 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <57C175B1-A485-4FEF-908C-7B849F576D5E@zope.com>
References: <45E99DC1.4010703@zetaweb.com> <45E8EB97.6090805@zetaweb.com>
	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
	<45E99DC1.4010703@zetaweb.com>
Message-ID: <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>

At 10:02 AM 3/5/2007 -0500, Jim Fulton wrote:
>Entry points add *a* mechanism to make those objects a bit more
>discoverable.  Arguably, specifying an application via:
>eggname#entrypointname doesn't provide much advantage over simply
>specifying the dotted path to an object in a module.

Actually, it provides one very important strategic advantage that I don't 
think has been mentioned in this conversation.  A configuration format that 
can specify project/version information can be used as a single-file 
deployment spec for an easy_install wrapper or buildout-like tool.

The advantage of this for virtual hosting providers in particular is 
significant -- if they support the tool, they can support this one-file 
deployment scheme.

Personally, I don't care for the Paste Deploy syntax -- frankly it's almost 
barbaric.  :)  But the concept of being able to specify stacks, routes, and 
configuration in a plain text format that includes package information for 
automated deployment is nonetheless an important one.

A couple years back, I started writing a library to parse a more 
sophisticated, Python-like syntax to do the same sorts of things, but only 
got as far as the parser.

One discussion was here:

http://mail.python.org/pipermail/web-sig/2005-August/001714.html

The basic idea behind the syntax was that assignments are like keyword 
arguments, and non-assignment statements are positional arguments.

I'm not altogether happy with that syntax either, however, as it has a 
little too much "more than one way to do it", which is one reason I never 
finished the implementation.  There is a library that parses it (and does 
other general-purpose Python-like DSL parsing) at:

ViewSVN:   http://svn.eby-sarna.com/SCALE/
Checkout:  svn://svn.eby-sarna.com/svnroot/SCALE/
Docs: 
http://peak.telecommunity.com/DevCenter/scale.dsl#parsing-declarations

Anyway, all that aside, I think it would be fantastic if we could come up 
with some "universal file format" for single-file configuration and 
deployment of applications (including auto-install of all needed eggs), 
that could get stdlib support and ultimately hosting company support.  This 
would actually give us a leg up on even PHP for ease-of-deployment.

In truth, it doesn't matter if the file *contents* are 
standardized.  Standardization could be as simple as defining a #! line like:

#!/usr/bin/pydeploy2.3 SomeFormatEgg==1.1

Where "SomeFormatEgg" offers a "python.deploy" entry point for running the 
file, and the pydeploy tool obtains the necessary egg and provides 
libraries for the parsing tool to auto-locate and install any eggs needed 
by the body.

This could also be a basis for bootstrapping other systems, including 
perhaps buildouts (e.g. "#!/usr/bin/pydeploy2.4 zc.buildout" at the top of 
a buildout .ini)!
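
As a rough sketch of the first step such a tool would take, the following parses the ``#!`` line to find out which egg knows how to read the rest of the file.  The pydeploy tool is hypothetical, so the header grammar, the regular expression and the function name are all assumptions for illustration.

```python
import re

# Hypothetical header grammar, matching lines such as
#   #!/usr/bin/pydeploy2.3 SomeFormatEgg==1.1
_HEADER = re.compile(
    r'^#!\s*\S*pydeploy(?P<pyver>\d+\.\d+)?\s+(?P<req>\S.*)$')


def parse_deploy_header(first_line):
    """Return (python_version, requirement), or None if not a pydeploy file.

    python_version is None when the tool name carries no version suffix.
    """
    match = _HEADER.match(first_line.strip())
    if match is None:
        return None
    return match.group('pyver'), match.group('req').strip()
```

The returned requirement string would then be handed to whatever installs eggs, and the egg's "python.deploy" entry point would be given the body of the file.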

So, while a single content format would be nice, we don't even need that in 
order to get a raw deployment system standard.  Perhaps I should build this 
hypothetical pydeploy tool into setuptools 0.7?


From pje at telecommunity.com  Mon Mar  5 22:39:23 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 05 Mar 2007 16:39:23 -0500
Subject: [Web-SIG] wsgiref and wsgi.multithread/wsgi.multiprocess
In-Reply-To: <20070209175649.GA21915@caltech.edu>
References: <5.1.1.6.0.20070209120902.038b7e20@sparrow.telecommunity.com>
	<20070209075401.GA9697@caltech.edu>
	<5.1.1.6.0.20070209120902.038b7e20@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070305163900.02a48088@sparrow.telecommunity.com>

At 09:56 AM 2/9/2007 -0800, Titus Brown wrote:
>On Fri, Feb 09, 2007 at 12:10:00PM -0500, Phillip J. Eby wrote:
>-> Yeah, multiprocess should probably be set false there, and
>-> multithreadedness should depend on whether the ThreadingTCPServer or
>-> whatever it's called is mixed in.  (HTTPServer does in fact support this,
>-> but it's not tested in a WSGI context as far as I know.)
>
>OK.  Err, do you want a patch? ;)

Not really, but I'll take one anyway.  :)
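
For reference, a sketch of the rule described in the quoted text: multiprocess is false, and multithread depends on whether the threading mix-in is in the server class.  ``ThreadingWSGIServer`` and ``concurrency_flags`` are illustrative names and not part of wsgiref.

```python
try:
    from socketserver import ThreadingMixIn  # Python 3 module name
except ImportError:
    from SocketServer import ThreadingMixIn  # Python 2 module name
from wsgiref.simple_server import WSGIServer


class ThreadingWSGIServer(ThreadingMixIn, WSGIServer):
    """wsgiref's server with one thread per request (illustrative)."""
    daemon_threads = True


def concurrency_flags(server_class):
    """Suggest wsgi.multithread / wsgi.multiprocess for a server class."""
    return {
        # True only when the threading mix-in is among the bases.
        'wsgi.multithread': issubclass(server_class, ThreadingMixIn),
        # Assumed false, following the suggestion quoted above.
        'wsgi.multiprocess': False,
    }
```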


From jtate at rpath.com  Mon Mar  5 23:23:09 2007
From: jtate at rpath.com (Joseph Tate)
Date: Mon, 5 Mar 2007 17:23:09 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <45EC8952.1040703@colorstudy.com>
References: <45E8EB97.6090805@zetaweb.com> <200703051254.57032.jtate@rpath.com>
	<45EC8952.1040703@colorstudy.com>
Message-ID: <200703051723.09795.jtate@rpath.com>

On Monday 05 March 2007 16:19:14 Ian Bicking wrote:
> Joseph Tate wrote:
> > I find that multiple files gives you a nice way to override defaults.  As
> > long as the files are read in a way that's predictable and documentable,
> > and ultimately appear as if read from a single file (and possibly
> > displayable via some diagnostics link in an application).
>
> Allowing this sort of thing means that the application carries around a
> complete config object of some sort, which I rather dislike -- it allows
> for smart applications, but it makes it much harder to understand the
> configuration and any possible side effects.  If we resolve the
> configuration down to something more limited (as the Paste Deploy entry
> points do) you can't really reconstruct the config from there.
> *Something* could still reconstruct the config (an alternate config
> loader, via logs, via debug settings, etc), just not the application
> itself.
>
> This is somewhat problematic for applications that have particularly
> complex config requirements, or want to support self-configuration.  The
> best solution that I can think of with Paste Deploy in that case is to
> just use the Paste Deploy configuration to point to the "real"
> configuration.

I agree.  That's why my app has a /config link that spits out the "effective" 
configuration.  Being able to override config is a hard requirement for me; 
I'd love to hear alternative solutions.  /etc/php.d, /etc/httpd/conf.d and 
that ilk come to mind as examples of this kind of thing.
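
A minimal sketch of that conf.d style of override, assuming ini-format files and Python 3's ``configparser``.  The function names and the file layout are made up for illustration, and this is not how my app actually does it.

```python
import configparser
import glob
import os


def load_effective_config(main_file, conf_dir):
    """Read main_file, then every *.ini in conf_dir in sorted order.

    Later files override earlier ones, so the result is predictable.
    """
    parser = configparser.ConfigParser()
    overrides = sorted(glob.glob(os.path.join(conf_dir, '*.ini')))
    parser.read([main_file] + overrides)
    return parser


def dump_effective_config(parser):
    """Render the merged settings as text, e.g. for a /config page."""
    lines = []
    for section in parser.sections():
        lines.append('[%s]' % section)
        # raw=True shows values as written, without interpolation.
        for key, value in parser.items(section, raw=True):
            lines.append('%s = %s' % (key, value))
    return '\n'.join(lines)
```

Sorting the file names is what makes the read order documentable: a ``10-local.ini`` always wins over the main file and loses to a ``20-site.ini``.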

-- 
Joseph Tate
Software Engineer
rPath Inc.
http://www.rpath.com/rbuilder/
(919) 851-3984 x2106

From jtate at rpath.com  Tue Mar  6 03:46:46 2007
From: jtate at rpath.com (Joseph Tate)
Date: Mon, 5 Mar 2007 21:46:46 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
References: <45E99DC1.4010703@zetaweb.com>
	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
Message-ID: <200703052146.46699.jtate@rpath.com>

On Monday 05 March 2007 16:38:51 Phillip J. Eby wrote:
> At 10:02 AM 3/5/2007 -0500, Jim Fulton wrote:
> >Entry points add *a* mechanism to make those objects a bit more
> >discoverable.  Arguably, specifying an application via:
> >eggname#entrypointname doesn't provide much advantage over simply
> >specifying the dotted path to an object in a module.
>
> Actually, it provides one very important strategic advantage that I don't
> think has been mentioned in this conversation.  A configuration format that
> can specify project/version information can be used as a single-file
> deployment spec for an easy_install wrapper or buildout-like tool.
>
> The advantage of this for virtual hosting providers in particular is
> significant -- if they support the tool, they can support this one-file
> deployment scheme.
>
> Personally, I don't care for the Paste Deploy syntax -- frankly it's almost
> barbaric.  :)  But the concept of being able to specify stacks, routes, and
> configuration in a plain text format that includes package information for
> automated deployment is nonetheless an important one.
>
> A couple years back, I started writing a library to parse a more
> sophisticated, Python-like syntax to do the same sorts of things, but only
> got as far as the parser.
>
> One discussion was here:
>
> http://mail.python.org/pipermail/web-sig/2005-August/001714.html
>
> The basic idea behind the syntax was that assignments are like keyword
> arguments, and non-assignment statements are positional arguments.
>
> I'm not altogether happy with that syntax either, however, as it has a
> little too much "more than one way to do it", which is one reason I never
> finished the implementation.  There is a library that parses it (and does
> other general-purpose Python-like DSL parsing) at:
>
> ViewSVN:   http://svn.eby-sarna.com/SCALE/
> Checkout:  svn://svn.eby-sarna.com/svnroot/SCALE/
> Docs:
> http://peak.telecommunity.com/DevCenter/scale.dsl#parsing-declarations
>
> Anyway, all that aside, I think it would be fantastic if we could come up
> with some "universal file format" for single-file configuration and
> deployment of applications (including auto-install of all needed eggs),
> that could get stdlib support and ultimately hosting company support.  This
> would actually give us a leg up on even PHP for ease-of-deployment.

Doesn't setuptools already give this?  easy_install foo.app.egg will install 
all of the needed eggs if the dependencies are properly listed.

> So, while a single content format would be nice, we don't even need that in
> order to get a raw deployment system standard.  Perhaps I should build this
> hypothetical pydeploy tool into setuptools 0.7?

I don't see there being a lot of demand for this.  The use case I'm 
considering is the end user developer or administrator deploying one or more 
delivered Python web applications to a production environment (self hosted, 
colo-hosted, or leased server).  I think that except for where you have 
multiple servers behind a load balancer or something, this is a one time 
operation (barring failure cases, etc).  Administrators already script this 
kind of thing using shell.

Also, in any "enterprise" environment that I'm familiar with, a mechanism 
that automatically downloads and installs software wouldn't fly.  
Administrators want to know everything that goes on a system, and want the 
software managed through their patch/package management system.  
Philosophical discussions on whether that's good or not seem to be 
irrelevant.

Those using $4.95 hosting plans are only setting up one server, and will need 
something custom to their installation anyway, so "pydeploy" won't help them 
either.  They'll be trying to install trac, some blogging software and then 
an arbitrary image gallery, et al., but won't have the same selections as 
another $4.95 hosting customer.  This is the key problem we're trying to 
solve.

I consider the packaging and delivery problem solved[1], or at least out of 
the scope of this problem.

-- 
Joseph Tate
Software Engineer
rPath Inc.
http://www.rpath.com/rbuilder/
(919) 851-3984 x2106

[1] Good enough for most things but better support for stuff outside the egg 
is needed: config files (so that the user can tweak them), locale data 
(or maybe a pkg_resources wrapper for gettext that loads that data from the 
egg).

From pje at telecommunity.com  Tue Mar  6 04:25:27 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 05 Mar 2007 22:25:27 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <200703052146.46699.jtate@rpath.com>
References: <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
	<45E99DC1.4010703@zetaweb.com>
	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070305222325.02812190@sparrow.telecommunity.com>

At 09:46 PM 3/5/2007 -0500, Joseph Tate wrote:
>Those using $4.95 hosting plans are only setting up one server, and will need
>something custom to their installation anyway, so "pydeploy" won't help them
>either.  They'll be trying to install trac, some blogging software and then
>an arbitrary image gallery, et al., but won't have the same selections as
>another $4.95 hosting customer.  This is the key problem we're trying to
>solve.

I was saying that they would drop in a single file for trac, a single file 
for a blog, one for an image gallery, etc.  That's a heck of a big 
deployment advantage, actually.

I wasn't talking about configuring a "server" -- I was talking about 
deploying *applications*.


From chris at simplistix.co.uk  Tue Mar  6 20:59:54 2007
From: chris at simplistix.co.uk (Chris Withers)
Date: Tue, 06 Mar 2007 19:59:54 +0000
Subject: [Web-SIG] The importance of deploying Python-based web apps on
 Windows (was: Re: [Proposal] "website" and first-level conf)
In-Reply-To: <a7a2b76b0703051042y46eb7d2bx3d4834f7d9866cdd@mail.gmail.com>
References: <a7a2b76b0703051042y46eb7d2bx3d4834f7d9866cdd@mail.gmail.com>
Message-ID: <45EDC83A.1050806@simplistix.co.uk>

Sidnei da Silva wrote:
> I seriously hope you are kidding.
> 
> Unfortunately that's not possible. A lot of people, especially when
> evaluating open-source projects, have their first contact with the
> software through the Windows platform. To quote some numbers, the
> Plone Installer for Windows has roughly 3x more downloads than the
> second most downloaded package [1].
> 
> Now, I see clearly two options for open-source projects: have a
> Windows story and increase your downloads by X%, where X can be a
> number between 50-300 *wink*, or not have a Windows story and rely
> on the *nix crowd to be the sole consumers of your software.
> 
> When you talk to a big organization that is already deploying their
> applications on the Windows platform what story do you want to tell them?
> 'Oh, and by the way, all your investment on Windows software, you will
> have to throw all that away if you want to use our software'. Good
> luck with that.
> 
> I think that it's pretty important that Python-based web apps have as
> good of a story on Windows as it has in other fields (pywin32 comes to
> mind) but feel free to disagree.
> 
> Sorry for the rant.

No, and this really deserves saying again...

Windows isn't going to vanish any time soon, and we're not going to help 
it vanish any quicker by head-in-sand'ing its existence...

cheers,

Chris

-- 
Simplistix - Content Management, Zope & Python Consulting
            - http://www.simplistix.co.uk

From chris at simplistix.co.uk  Tue Mar  6 20:56:34 2007
From: chris at simplistix.co.uk (Chris Withers)
Date: Tue, 06 Mar 2007 19:56:34 +0000
Subject: [Web-SIG] [Proposal] "website" and first-level conf
In-Reply-To: <35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
Message-ID: <45EDC772.3090803@simplistix.co.uk>

Jim Fulton wrote:
> On Mar 3, 2007, at 11:27 PM, Chad Whitacre wrote:
> ...
>> Now, Jim: it looks like Zope still uses a Unix-y userland for
>> INSTANCE_HOME, yes?
> 
> Yes, but I hate it.  At Zope Corporation, We're moving away from it  
> for a number of reasons.

I actually like it a lot, still, and I haven't heard compelling 
arguments, for me, for other things...

The big plus point for me is that everything needed for one deployment 
is in one folder.

I agree with Jim that in large-scale deployments, as ZC does, there may 
not be the need to worry about this, but I think python is probably in 
use in a lot more projects where there's more than one project per 
machine, and you want to be able to totally isolate them from each other.

INSTANCE_HOME in Zope 2 felt like the right balance for me...

> For development, it adds structure that isn't needed.  A Zope  
> instance really only needs a few files.  Trying to mimic some  
> notional unix layout just adds pointless structure.

It's kind of self-documenting though:

/etc -> config
/bin -> scripts
/var -> data
/log -> logs

I like that consistency, regardless of its origins...

> The traditional complex Zope instance file layout led to the use of  
> an instance "skeleton" to deal with all of the files, which led, in  
> turn, to a copy and hack style of configuration customization that is  
> inflexible and encourages cruft.

I think the Zope 3 skeletons went the wrong way. Skeletons work, but 
only where they contain just the config that's specific to that instance. 
Zope 3's notion of putting python scripts (and non-trivial ones at that!) 
and the like into the instance home made me shudder...

> For production deployments, we (Zope Corporation) install files into  
> the *real* Unix tree where site administrators want them. 

Not everyone runs on unix. Having a standard layout that fits into a 
folder works cross platform to a large extent.

> Keeping the number of files used by an application minimal makes it  
> easier to deal with the different needs of development and deployment  
> and makes it easier, at least for me, to deal with different  
> configurations.

Yep.

> I'll note that I find lib/python especially silly. 

Agreed. lib would be fine, mind you, so would Products ;-)

cheers,

Chris

-- 
Simplistix - Content Management, Zope & Python Consulting
            - http://www.simplistix.co.uk

From ianb at colorstudy.com  Wed Mar  7 03:08:46 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 06 Mar 2007 20:08:46 -0600
Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking Middleware
Message-ID: <45EE1EAE.50705@colorstudy.com>

Posted here: http://wsgi.org/wsgi/Specifications/avoiding_serialization

Text copied below for discussion:


:Title: Avoiding Serialization When Stacking Middleware
:Author: Ian Bicking <ianb at colorstudy.com>
:Discussions-To: Python Web-SIG <web-sig at python.org>
:Status: Proposed
:Created: 06-03-2007

.. contents::

Abstract
--------

This proposal gives a strategy for avoiding unnecessary serialization 
and deserialization of request and response bodies.  It does so by 
attaching attributes to ``wsgi.input`` and the ``app_iter``, as well as 
a new environment key ``x-wsgiorg.want_parsed_response``.

Rationale
---------

Output-transforming middleware often has to parse the upstream content, 
transform it, then serialize it back to a string for output.  The 
original output may have already been in the parsed form that the 
middleware wanted.  Or there may be more middleware that does similar 
transformations on the same kind of objects.

The same things apply to the parsing of ``wsgi.input``, specifically 
parsing form data.  A similar strategy is presented to avoid 
unnecessarily reparsing that data.

Specification
-------------

WSGI applications (or middleware) can return an app_iter that not only 
serializes the output, but also has extra attributes.  An attribute is 
given here, ``app_iter.x_wsgiorg_parsed_response`` which is a 
function/method that takes one argument, the "type" of object that you 
want to receive.  It may return that type of object, or None (meaning it 
cannot produce that type of object).  Consumers should fall back on 
normal parsing of the response if the method does not exist, or returns 
None.

Similarly the ``environ['wsgi.input']`` object may have the same method, 
with the same meaning.
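
The consumer side of this rule can be sketched as a small helper.  ``get_parsed`` and ``ParsedList`` are illustrative names only and are not part of the proposal; the toy iterator uses a plain ``list`` as its "parsed" type.

```python
def get_parsed(obj, wanted_type, fallback):
    """Ask obj for a pre-parsed body of wanted_type, else call fallback().

    obj is an app_iter or environ['wsgi.input'].
    """
    method = getattr(obj, 'x_wsgiorg_parsed_response', None)
    if method is not None:
        parsed = method(wanted_type)
        if parsed is not None:
            return parsed
    # Method missing, or it could not produce that type: parse normally.
    return fallback()


class ParsedList(object):
    """Toy app_iter whose parsed form is a list of words."""

    def __init__(self, words):
        self.words = words

    def __iter__(self):
        # Serialization only happens if somebody iterates.
        yield ' '.join(self.words).encode('utf-8')

    def x_wsgiorg_parsed_response(self, type):
        if type is list:
            return self.words
        return None
```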

WSGI applications that want to lazily serialize their output have a 
problem: they probably cannot calculate ``Content-Length`` without doing 
the actual serialization.  Browsers typically want to know about 
``Content-Length``, but WSGI middleware seldom cares, since it can just 
get the content from app_iter regardless of its length.  WSGI middleware 
that will transform the output can set 
``environ['x-wsgiorg.want_parsed_response'] = True`` to give this hint 
to the application.  Applications are thus encouraged to only lazily 
serialize their output when that key is present and true.  (There is no 
equivalent concept for ``wsgi.input``.)
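
A sketch of an application honoring that hint, using JSON and a ``dict`` as the parsed type purely for illustration.  ``LazyJSON`` and the payload are assumptions and not part of the proposal.

```python
import json


class LazyJSON(object):
    """app_iter that serializes only if somebody actually iterates it."""

    def __init__(self, data):
        self.data = data

    def __iter__(self):
        yield json.dumps(self.data).encode('utf-8')

    def x_wsgiorg_parsed_response(self, type):
        return self.data if type is dict else None


def application(environ, start_response):
    data = {'greeting': 'hello'}
    headers = [('Content-Type', 'application/json')]
    if environ.get('x-wsgiorg.want_parsed_response'):
        # Middleware asked for the parsed form: skip Content-Length.
        start_response('200 OK', headers)
        return LazyJSON(data)
    # Nobody asked, so serialize eagerly and report the length.
    body = json.dumps(data).encode('utf-8')
    headers.append(('Content-Length', str(len(body))))
    start_response('200 OK', headers)
    return [body]
```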

The object returned by ``.x_wsgiorg_parsed_response()`` may be modified 
in-place by the WSGI middleware using that object.  Producers should 
make a copy if they do not want consumers modifying the object.

Example
--------

Two examples are provided: one for output, and one for input.

The output transformation parses the page with ``lxml.etree.HTML`` (from 
the `lxml <http://codespeak.net/lxml/>`_ library) and replaces all 
``<i>`` tags with ``<em>`` tags.  First we show the middleware::

     import lxml.etree

     class EmTagMiddleware(object):
         def __init__(self, app):
             self.app = app
         def __call__(self, environ, start_response):
             parent_wants_parsed = environ.get(
                 'x-wsgiorg.want_parsed_response')
             environ['x-wsgiorg.want_parsed_response'] = True
             written_output = []
             captured_headers = []
             def repl_start_response(status, headers, exc_info=None):
                 if exc_info:
                     raise exc_info[0], exc_info[1], exc_info[2]
                 captured_headers[:] = [status, headers]
                 return written_output.append
             app_iter = self.app(environ, repl_start_response)
             parsed = None
             if captured_headers and not written_output:
                 method = getattr(app_iter, 'x_wsgiorg_parsed_response',
                                  None)
                 if method:
                     parsed = method(lxml.etree._Element)
             if parsed is None:
                 # Have to manually parse, because:
                 #  a) start_response was called lazily
                 #  b) the start_response writer was used
                 #  c) app_iter.x_wsgiorg_parsed_response didn't exist
                 #  d) that method returned None
                 try:
                     for item in app_iter:
                         written_output.append(item)
                 finally:
                     if hasattr(app_iter, 'close'):
                         app_iter.close()
                 parsed = self.parse_body(''.join(written_output))
             status, headers = captured_headers
             new_body = self.transform_body(parsed)
             for i in range(len(headers)):
                 if headers[i][0].lower() == 'content-length':
                     del headers[i]
                     break
             if parent_wants_parsed:
                 new_app_iter = self.make_app_iter(new_body)
             else:
                 serialized_body = serialize(new_body)
                 headers.append(('Content-Length',
                                 str(len(serialized_body))))
                 new_app_iter = [serialized_body]
             # Pass the (possibly modified) status and headers downstream.
             start_response(status, headers)
             return new_app_iter

         def parse_body(self, body):
             return lxml.etree.HTML(body)

         def transform_body(self, root):
             for el in root.xpath('//i'):
                 el.tag = 'em'
             return root

         def make_app_iter(self, body):
             return LazyLXML(body)

     def serialize(element):
         return lxml.etree.tostring(element)

     class LazyLXML(object):
         def __init__(self, body):
             self.body = body
             self.have_yielded = False
         def __iter__(self):
             return self
         def next(self):
             if self.have_yielded:
                 raise StopIteration
             self.have_yielded = True
             return serialize(self.body)
         def x_wsgiorg_parsed_response(self, type):
             if type is lxml.etree._Element:
                 return self.body
             return None

Here's a simpler example for parsing normal form inputs in ``wsgi.input``::

     import cgi
     import urllib
     from cStringIO import StringIO

     def parse_form(environ):
         content_type = environ.get('CONTENT_TYPE', '')
         assert content_type in ['application/x-www-form-urlencoded',
                                 'multipart/form-data']
         wsgi_input = environ['wsgi.input']
         method = getattr(wsgi_input, 'x_wsgiorg_parsed_response', None)
         if method:
             parsed = method(cgi.FieldStorage)
             if parsed is not None:
                 return parsed
         form = cgi.FieldStorage(fp=wsgi_input, environ=environ,
                                 keep_blank_values=True)
         environ['wsgi.input'] = FakeFormInput(form)
         return form

     class FakeFormInput(object):
         def __init__(self, form):
             self.form = form
             self.serialized = None
         def x_wsgiorg_parsed_response(self, type):
             if type is cgi.FieldStorage:
                 return self.form
             return None
         def read(self, *args):
             if self.serialized is None:
                 self._serialize()
             return self.serialized.read(*args)
         def readline(self, *args):
             if self.serialized is None:
                 self._serialize()
             return self.serialized.readline(*args)
         def readlines(self, *args):
             if self.serialized is None:
                 self._serialize()
             return self.serialized.readlines(*args)
         def __iter__(self):
             if self.serialized is None:
                 self._serialize()
             return iter(self.serialized)
         def _serialize(self):
             # XXX: Doesn't deal with file uploads, and
             # multipart/form-data generally
             data = urllib.urlencode(self.form.list, True)
             self.serialized = StringIO(data)

Problems
--------

Obviously the code is not simple, but this is the nature of WSGI 
output-transforming middleware.  Ideally a framework of some sort would 
be used to construct this kind of middleware.

Something that replaces ``wsgi.input`` (like the example) may change the 
``CONTENT_LENGTH`` of the request; normalization alone may change the 
length, even if the data is the same (e.g., there are multiple ways to 
urlencode a string).  However, there's no way without actually 
serializing to determine the proper length.  Ideally requests like this 
should allow simply reading to the end of the object, without needing a 
``CONTENT_LENGTH`` restriction (this is not true for socket objects). 
Ideally something like ``CONTENT_LENGTH="-1"`` would indicate this 
situation (simply a missing ``CONTENT_LENGTH`` generally means ``0``). 
Another option is to set it to 1 and simply return the entire serialized 
response all at once.  ``cgi.FieldStorage`` actually protects against 
this.  Or set it to a very very large value, and allow reading past the 
end (returning ``""``).  This is likely to work with most consumers. 
I'm not sure what effect -1 will have on different code.
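
The normalization point can be seen with two serializations of the same field.  This snippet assumes Python 3's ``urllib.parse``, and the field value is arbitrary.

```python
from urllib.parse import parse_qs

# Two serializations of the same form data: the space is written
# as "+" in one and as "%20" in the other.
body_a = 'name=Ian+Bicking'
body_b = 'name=Ian%20Bicking'

# Both parse to the same data, yet the bodies differ in length, so a
# CONTENT_LENGTH computed for one is wrong for the other.
same_data = parse_qs(body_a) == parse_qs(body_b)
lengths = (len(body_a), len(body_b))
```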

Other Possibilities
-------------------

* You could simply parse everything every time.
* You could pass data through callbacks in the environment (but this can 
break non-aware middleware).
* You can make custom methods and keys for each case.
* You can use something other than WSGI.

I think this specification offers advantages over all these options.

Open Issues
-----------

Should "type" be the class object?  A string describing the type? 
Things like ``lxml.etree._Element`` are a little unclean, since the 
*actual* class isn't a public object (only the factory function 
``lxml.etree.Element``).  Also, there are occasionally times when 
multiple classes implement the same interface.

The boolean ``environ['x-wsgiorg.want_parsed_response']`` doesn't really 
give any idea of what *kind* of object you want.  This is actually 
something of a problem, because sometimes it's impossible to give that 
kind of object.  For instance, if you want to transform images you might 
want the PIL object for the image.  But if the response is HTML there's 
no way to give this type.  Similarly if you are transforming HTML then 
images don't mean anything to you, and you probably *do* want them to 
come out as normal.  And potentially *both* an image transformer and an 
HTML transformer are in the stack.  Should that key actually hold a list 
of types that are of interest?

``x_wsgiorg_parsed_response`` isn't a very good name for the method on 
``wsgi.input``, as it's not a response.

From pje at telecommunity.com  Wed Mar  7 03:52:20 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 06 Mar 2007 21:52:20 -0500
Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking
 Middleware
In-Reply-To: <45EE1EAE.50705@colorstudy.com>
Message-ID: <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>

At 08:08 PM 3/6/2007 -0600, Ian Bicking wrote:
>Posted here: http://wsgi.org/wsgi/Specifications/avoiding_serialization
>
>Text copied below for discussion:
>
>
>:Title: Avoiding Serialization When Stacking Middleware
>:Author: Ian Bicking <ianb at colorstudy.com>
>:Discussions-To: Python Web-SIG <web-sig at python.org>
>:Status: Proposed
>:Created: 06-03-2007
>
>.. contents::
>
>Abstract
>--------
>
>This proposal gives a strategy for avoiding unnecessary serialization
>and deserialization of request and response bodies.  It does so by
>attaching attributes to ``wsgi.input`` and the ``app_iter``, as well as
>a new environment key ``x-wsgiorg.want_parsed_response``.
>
>Rationale
>---------
>
>Output-transforming middleware often has to parse the upstream content,
>transform it, then serialize it back to a string for output.  The
>original output may have already been in the parsed form that the
>middleware wanted.  Or there may be more middleware that does similar
>transformations on the same kind of objects.

HTTP already includes a mechanism for specifying what types are accepted by 
a content consumer: the "Accept" header.  You can always add other values 
to it to indicate the parsed values you can accept.

Of course, this doesn't really work well with WSGI - you want the result to 
actually *be* WSGI...  so you can use the WSGI way of doing this, which is 
to have a standard wrapper for the specific content type you want to use.

The wrapper (as with the wsgi "file wrapper") simply puts a WSGI face on a 
non-WSGI result body, converting it to an iterator of strings, and holding 
other attributes known to the middleware or other application object.

This could be implemented as an environ key containing a mapping from types 
to wrapper functions.  Middleware that wants a type just copies the mapping 
and overwrites any entries it cares about.  Applications that want to 
return a non-serialized result just look up the type (using __mro__ order) 
to find an applicable wrapper.
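
A rough sketch of that lookup, with a made-up environ key name (``example.result_wrappers``) and made-up helper names.  Nothing here is a blessed WSGI extension.

```python
WRAPPERS_KEY = 'example.result_wrappers'  # hypothetical key name


def want_type(environ, cls, wrapper):
    """Middleware side: copy the mapping, then register our wrapper."""
    wrappers = dict(environ.get(WRAPPERS_KEY, {}))
    wrappers[cls] = wrapper
    environ[WRAPPERS_KEY] = wrappers


def find_wrapper(environ, result):
    """Application side: return the wrapper for the closest base class.

    Walks type(result).__mro__ so a subclass instance still finds a
    wrapper registered for one of its bases.  Returns None if there
    is no match, in which case the app serializes as usual.
    """
    wrappers = environ.get(WRAPPERS_KEY, {})
    for cls in type(result).__mro__:
        if cls in wrappers:
            return wrappers[cls]
    return None
```

Copying the mapping before overwriting keeps an outer middleware's entries from leaking into sibling applications that share the original dict.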

Notice that this approach doesn't require any special protocol for these 
wrappers -- just WSGI.  It's simpler to specify, and simpler to implement 
than what you propose, while addressing some of the open issues.

Yes, it does have some problems with interface vs. implementation.  ISTM 
that trying to solve that problem is effectively asking to revive or 
reinvent PEP 246, however.  But we could explicitly allow the use of type 
names instead of the actual types.


>The same things apply to the parsing of ``wsgi.input``, specifically
>parsing form data.  A similar strategy is presented to avoid
>unnecessarily reparsing that data.

I would rather offer an optional 'get_file_storage()' method or some such 
as a blessed WSGI extension, than have such an open-ended "get whatever you 
want from the input object" concept floating around.  A strategy which 
reinvents half of PEP 246 (the *old* PEP 246, before it became almost as 
complicated as WSGI) seems like overkill to me.


>Obviously the code is not simple, but this is the nature of WSGI
>output-transforming middleware.

Something I'd like to fix in WSGI 2.0, by getting rid of both 
"start_response" and "write", but that's a discussion for another time.


>Other Possibilities
>-------------------
>
>* You could simply parse everything every time.
>* You could pass data through callbacks in the environment (but this can
>break non-aware middleware).
>* You can make custom methods and keys for each case.
>* You can use something other than WSGI.

And you can use the established WSGI method for adding semantics to a 
response, using a middleware-supplied wrapper.  I think this is actually 
the best alternative.

In truth, it could be as simple as using the class's fully-qualified name 
as an environ key (perhaps with a prefix or suffix), with the value being a 
wrapper for objects implementing that protocol.  No 
x-foobar-wsgiorg-whatchamacallit cruft needed.

And, it's lightweight enough of a concept to be expressed as a simple "best 
practice" design pattern.


From fumanchu at amor.org  Wed Mar  7 04:23:17 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Tue, 6 Mar 2007 19:23:17 -0800
Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking
	Middleware
References: <45EE1EAE.50705@colorstudy.com>
Message-ID: <435DF58A933BA74397B42CDEB8145A86224D55@ex9.hostedexchange.local>

Ian Bicking wrote:
> This proposal gives a strategy for avoiding unnecessary
> serialization and deserialization of request and response
> bodies.  It does so by attaching attributes to ``wsgi.input``
> and the ``app_iter``, as well as a new environment key
> ``x-wsgiorg.want_parsed_response``.
> 
> [snip]
> 
> for item in app_iter:
>     written_output.append(item)

This bit of the example, at least, is not compliant with PEP 333:
http://www.python.org/dev/peps/pep-0333/#middleware-handling-of-block-boundaries

"To put this requirement another way, a middleware component
must yield at least one value each time its underlying
application yields a value. If the middleware cannot yield
any other value, it must yield an empty string."

I suspect rewriting the example to conform to PEP 333 will make this proposal much more complex?
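
For reference, a minimal sketch of what a conforming buffering loop could
look like.  This is not the code from the proposal; buffering_middleware
and transform are made-up names, and the sketch passes start_response
through untouched, so it ignores the header rewriting (Content-Length and
so on) that real transforming middleware needs.

```python
def buffering_middleware(app, transform):
    """Buffer the whole body, yielding b"" once per upstream block."""
    def wrapped(environ, start_response):
        chunks = []
        app_iter = app(environ, start_response)
        try:
            for item in app_iter:
                chunks.append(item)
                # PEP 333: yield something each time the app yields,
                # even if there is nothing useful to send yet.
                yield b""
            yield transform(b"".join(chunks))
        finally:
            close = getattr(app_iter, "close", None)
            if close is not None:
                close()
    return wrapped
```

The empty strings keep the block boundaries visible to the server while
the real output is held back until the application is done.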


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From ianb at colorstudy.com  Wed Mar  7 04:43:43 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 06 Mar 2007 21:43:43 -0600
Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking
	Middleware
In-Reply-To: <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>
Message-ID: <45EE34EF.9030602@colorstudy.com>

Phillip J. Eby wrote:
> At 08:08 PM 3/6/2007 -0600, Ian Bicking wrote:
>> Posted here: http://wsgi.org/wsgi/Specifications/avoiding_serialization
>>
>> Text copied below for discussion:
>>
>>
>> :Title: Avoiding Serialization When Stacking Middleware
>> :Author: Ian Bicking <ianb at colorstudy.com>
>> :Discussions-To: Python Web-SIG <web-sig at python.org>
>> :Status: Proposed
>> :Created: 06-03-2007
>>
>> .. contents::
>>
>> Abstract
>> --------
>>
>> This proposal gives a strategy for avoiding unnecessary serialization
>> and deserialization of request and response bodies.  It does so by
>> attaching attributes to ``wsgi.input`` and the ``app_iter``, as well as
>> a new environment key ``x-wsgiorg.want_parsed_response``.
>>
>> Rationale
>> ---------
>>
>> Output-transforming middleware often has to parse the upstream content,
>> transform it, then serialize it back to a string for output.  The
>> original output may have already been in the parsed form that the
>> middleware wanted.  Or there may be more middleware that does similar
>> transformations on the same kind of objects.
> 
> HTTP already includes a mechanism for specifying what types are accepted 
> by a content consumer: the "Accept" header.  You can always add other 
> values to it to indicate the parsed values you can accept.
> 
> Of course, this doesn't really work well with WSGI - you want the result 
> to actually *be* WSGI...  so you can use the WSGI way of doing this, 
> which is to have a standard wrapper for the specific content type you 
> want to use.

Yeah, using Accept is clever, but not really accurate, since if you 
serialize the WSGI request to HTTP the addition no longer makes sense.

> The wrapper (as with the wsgi "file wrapper") simply puts a WSGI face on 
> a non-WSGI result body, converting it to an iterator of strings, and 
> holding other attributes known to the middleware or other application 
> object.

That just calls for a series of ad hoc techniques, basically, where each 
object type results in a new key in the environment and a new ad hoc 
specification to be made (e.g., wsgi.file_wrapper takes a block size, 
which is specific only to that case).

> This could be implemented as an environ key containing a mapping from 
> types to wrapper functions.  Middleware that wants a type just copies 
> the mapping and overwrites any entries it cares about.  Applications 
> that want to return a non-serialized result just look up the type (using 
> __mro__ order) to find an applicable wrapper.

OK, the dict would avoid multiple different kinds of keys, and 
presumably they'd all have the same signature.  Block size doesn't 
really make any sense to me as a common parameter.  Content type should 
be a common parameter, as something like an lxml object can be 
serialized as either XML or HTML.  I don't think any response headers 
are likely to affect the serialization... though with my specification 
that remains an application concern, so it doesn't have to be resolved 
in the specification.

I hadn't really thought about MRO, though generally I don't trust 
inheritance to be meaningful anyway -- I feel like I'd be more likely to 
switch on the type than test inheritance.

> Notice that this approach doesn't require any special protocol for these 
> wrappers -- just WSGI.  It's simpler to specify, and simpler to 
> implement than what you propose, while addressing some of the open issues.

The specification isn't particularly long or complicated, IMHO.  The 
implementation is complicated mostly for reasons unrelated to the 
specification -- any output-transforming middleware will be similarly 
complicated.

> Yes, it does have some problems with interface vs. implementation.  ISTM 
> that trying to solve that problem is effectively asking to revive or 
> reinvent PEP 246, however.  But we could explicitly allow the use of 
> type names instead of the actual types.

When playing with implementation I used type names, and actually I 
rather prefer them, but it's not always clear what name to use.  For 
instance, "lxml", "lxml.etree", "lxml.etree.Element", and 
"lxml.etree._Element" all are reasonable names.  Or "ElementTree", 
"ElementTree.Element", "ElementTree._Element", "xml.etree", 
"xml.etree.Element", and "xml.etree._Element".  Or even something like 
"IElement" could make sense in some context (e.g., what if you can 
accept the overlapping interfaces of both lxml and ElementTree?)

At least the actual type object seems easy enough.  OTOH, there are 
actually cases when I'd like to say that I could accept a certain type 
without having to import the type.  E.g., if I wanted to do an XSLT 
transformation, I *could* support several kinds of objects without 
requiring any of them (e.g., lxml, 4DOM, and Genshi Markup).

>> The same things apply to the parsing of ``wsgi.input``, specifically
>> parsing form data.  A similar strategy is presented to avoid
>> unnecessarily reparsing that data.
> 
> I would rather offer an optional 'get_file_storage()' method or some 
> such as a blessed WSGI extension, than have such an open-ended "get 
> whatever you want from the input object" concept floating around.  A 
> strategy which reinvents half of PEP 246 (the *old* PEP 246, before it 
> became almost as complicated as WSGI) seems like overkill to me.

I don't really understand what you are proposing.  This part addresses 
the same issues as presented in 
http://wsgi.org/wsgi/Specifications/handling_post_forms

I really don't *want* to write every wsgi.input to a temporary file just 
because someone else *might* want to reparse the input.  I'd much rather 
do it lazily, as 99% of the time reparsing won't happen.

>> Obviously the code is not simple, but this is the nature of WSGI
>> output-transforming middleware.
> 
> Something I'd like to fix in WSGI 2.0, by getting rid of both 
> "start_response" and "write", but that's a discussion for another time.

Yeah, that'd be nice, but another discussion for another time.

>> Other Possibilities
>> -------------------
>>
>> * You could simply parse everything every time.
>> * You could pass data through callbacks in the environment (but this can
>> break non-aware middleware).
>> * You can make custom methods and keys for each case.
>> * You can use something other than WSGI.
> 
> And you can use the established WSGI method for adding semantics to a 
> response, using a middleware-supplied wrapper.  I think this is actually 
> the best alternative.

I really don't understand the advantage.

> In truth, it could be as simple as using the class's fully-qualified 
> name as an environ key (perhaps with a prefix or suffix), with the value 
> being a wrapper for objects implementing that protocol.  No 
> x-foobar-wsgiorg-whatchamacallit cruft needed.
> 
> And, it's lightweight enough of a concept to be expressed as a simple 
> "best practice" design pattern.

Best practice is fine, though of course still needs to be documented, as 
this is hardly a practice that people would naturally think about or 
implement.  But I don't really think that practice would be any simpler 
or easier to describe if done completely.  In fact, I think it would 
take exactly the same amount of space to describe.


-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From pje at telecommunity.com  Wed Mar  7 05:51:39 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 06 Mar 2007 23:51:39 -0500
Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking
 Middleware
In-Reply-To: <45EE34EF.9030602@colorstudy.com>
References: <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com>

At 09:43 PM 3/6/2007 -0600, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>The wrapper (as with the wsgi "file wrapper") simply puts a WSGI face on 
>>a non-WSGI result body, converting it to an iterator of strings, and 
>>holding other attributes known to the middleware or other application object.
>
>That just calls for a series of ad hoc techniques,

As is appropriate for a "series of tubes".  :)

>  basically, where each object type results in a new key in the 
> environment and a new ad hoc specification to be made (e.g., 
> wsgi.file_wrapper takes a block size, which is specific only to that case).

Right.  I'm specifically saying that a collection of individual 
specifications is much *better* than a single overarching specification 
generalized from a single example.  Single use cases make bad general specs.


>OK, the dict would avoid multiple different kinds of keys, and presumably 
>they'd all have the same signature.  Block size doesn't really make any 
>sense to me as a common parameter.  Content type should be a common 
>parameter, as something like an lxml object can be serialized as either 
>XML or HTML.  I don't think any response headers are likely to affect the 
>serialization... though with my specification that remains an application 
>concern, so it doesn't have to be resolved in the specification.

Please don't keep trying to generalize this.  They're called 
"specific-ations", not "general-izations".  :)


>>Notice that this approach doesn't require any special protocol for these 
>>wrappers -- just WSGI.  It's simpler to specify, and simpler to implement 
>>than what you propose, while addressing some of the open issues.
>
>The specification isn't particularly long or complicated, IMHO.

That's because it doesn't address any of the real issues -- they're all 
deferred to your "open issues" section.  That's why I don't think having 
the specification adds any value over highlighting the existing WSGI 
pattern for extending the response (i.e. server-supplied iterator-wrappers).


>When playing with implementation I used type names, and actually I rather 
>prefer them, but it's not always clear what name to use.  For instance, 
>"lxml", "lxml.etree", "lxml.etree.Element", and "lxml.etree._Element" all 
>are reasonable names.  Or "ElementTree", "ElementTree.Element", 
>"ElementTree._Element", "xml.etree", "xml.etree.Element", and 
>"xml.etree._Element".  Or even something like "IElement" could make sense 
>in some context (e.g., what if you can accept the overlapping interfaces 
>of both lxml and ElementTree?)
>
>At least the actual type object seems easy enough.  OTOH, there are 
>actually cases when I'd like to say that I could accept a certain type 
>without having to import the type.  E.g., if I wanted to do an XSLT 
>transformation, I *could* support several kinds of objects without 
>requiring any of them (e.g., lxml, 4DOM, and Genshi Markup).

These problems all stem from premature generalization.  It's a trivial 
problem to fix, however, if you are trying to share one particular content 
type: just pick a key and use it!

Libraries such as wsgiref can support this pattern by providing a utility 
function like "wrap_content(environ, content, default_wrapper, *keys)" that 
looks up "keys" to find a wrapper to use in place of the default_wrapper.
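
A sketch of how such a helper might look.  To be clear, wsgiref has no
wrap_content() today; the body below is only one guess at the behavior
described above, and the environ key and wrappers in the usage note are
invented for the example.

```python
def wrap_content(environ, content, default_wrapper, *keys):
    """Wrap content with the first wrapper found under keys, else the default."""
    for key in keys:
        wrapper = environ.get(key)
        if wrapper is not None:
            return wrapper(content)
    return default_wrapper(content)


def serialize(text):
    # Default: plain WSGI body, one block of bytes.
    return [text.encode("utf-8")]


class TextWrapper:
    """Hypothetical middleware-supplied wrapper that keeps the raw object."""
    def __init__(self, text):
        self.text = text

    def __iter__(self):
        yield self.text.encode("utf-8")
```

With no matching key in environ, wrap_content(environ, "hi", serialize,
"example.text") returns the plain list; if middleware has put TextWrapper
under "example.text", the application's result is a TextWrapper instead.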


>>>The same things apply to the parsing of ``wsgi.input``, specifically
>>>parsing form data.  A similar strategy is presented to avoid
>>>unnecessarily reparsing that data.
>>I would rather offer an optional 'get_file_storage()' method or some such 
>>as a blessed WSGI extension, than have such an open-ended "get whatever 
>>you want from the input object" concept floating around.  A strategy 
>>which reinvents half of PEP 246 (the *old* PEP 246, before it became 
>>almost as complicated as WSGI) seems like overkill to me.
>
>I don't really understand what you are proposing.

That wsgi.input be allowed to have a 'get_file_storage()' method that can 
be called by applications, and that calling it means the input stream must 
not have been read and will no longer be readable.
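
Roughly, and only as a sketch: the StorageInput class below is
hypothetical, and urllib.parse.parse_qs stands in for whatever "file
storage" object a real extension would return.

```python
import io
from urllib.parse import parse_qs


class StorageInput:
    """wsgi.input stand-in with an optional get_file_storage() method."""

    def __init__(self, stream, length):
        self._stream = stream
        self._length = length
        self._read_started = False
        self._storage = None

    def read(self, size=-1):
        if self._storage is not None:
            raise RuntimeError("input was consumed by get_file_storage()")
        self._read_started = True
        return self._stream.read(size)

    def get_file_storage(self):
        if self._read_started:
            raise RuntimeError("input stream has already been read")
        if self._storage is None:
            body = self._stream.read(self._length)
            # Simplification: treat the body as urlencoded form data.
            self._storage = parse_qs(body.decode("ascii"))
        return self._storage
```

The two error paths are the whole contract: parse only an unread stream,
and refuse raw reads afterwards.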


>This part addresses the same issues as presented in 
>http://wsgi.org/wsgi/Specifications/handling_post_forms
>
>I really don't *want* to write every wsgi.input to a temporary file just 
>because someone else *might* want to reparse the input.  I'd much rather 
>do it lazily, as 99% of the time reparsing won't happen.

I don't understand your complaint, as it seems unrelated to what I propose.


>>>Other Possibilities
>>>-------------------
>>>
>>>* You could simply parse everything every time.
>>>* You could pass data through callbacks in the environment (but this can
>>>break non-aware middleware).
>>>* You can make custom methods and keys for each case.
>>>* You can use something other than WSGI.
>>And you can use the established WSGI method for adding semantics to a 
>>response, using a middleware-supplied wrapper.  I think this is actually 
>>the best alternative.
>
>I really don't understand the advantage.

It's simple: *specifications are a liability in the general case*.  They 
are supposed to be the record of negotiations between people who need to 
co-operate, not an attempt to solve all possible problems.

So, if your spec is only about how relatively tight-coupled WFC's (WSGI 
framework components) talk to each other, it seems more properly the 
business of a web framework, not WSGI.

However, it *is* WSGI (wsgi-onic?) for the authors of certain components to 
get together and say, "hey let's agree on this wrapper protocol"...  or 
better yet, a wrapper *implementation*.

This is way way better than having another spec.  Every godforsaken new 
spec attached to WSGI just makes the whole thing seem way too 
complicated.  In retrospect, I wish I hadn't supported some of the options 
and doodads and whatnots that are in WSGI today.  If I had it to do over, 
WSGI would be a lot simpler.

However, it's not too late to stop adding new cruft -- and I consider the 
idea of reinventing PEP 246 inside of WSGI to be cruft of a most horrible kind.


>Best practice is fine, though of course still needs to be documented, as 
>this is hardly a practice that people would naturally think about or implement.

Well, it's in PEP 333.


>   But I don't really think that practice would be any simpler or easier 
> to describe if done completely.  In fact, I think it would take exactly 
> the same amount of space to describe.

Even if it *did*, it'd still be better.  However, since it's not a spec, it 
can be presented informally.  Here's an example:

"If you want to give applications underneath your middleware a chance to 
return rich responses (i.e., objects instead of strings), follow the 
pattern used for the WSGI 'file wrapper' object.  That is, have your server 
or middleware add an environ key with a wrapper API that can convert the 
richer objects you're expecting into a standard WSGI iterator.  Then, your 
server can simply inspect the iterators it receives to see if they are 
instances of your wrapper type, and pull out the objects you want.  In this 
way, if there is middleware between you and the application returning the 
rich response that modifies the response body, you will receive an iterator 
of a different type, which you can process in the usual way.  However, if 
you receive an instance of your wrapper type, you will know that you can 
access the rich data directly."
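
In code, the pattern might look something like the following.  This is a
sketch under assumptions: the environ key "example.rich_response", the
RichResponse class, and the JSON payload are all invented for the
example, and response headers are left alone for brevity.

```python
import json

KEY = "example.rich_response"  # hypothetical environ key


def _to_json(obj):
    return json.dumps(obj, sort_keys=True).encode("utf-8")


class RichResponse:
    """Iterates as serialized bytes, but keeps the original object around."""
    def __init__(self, obj):
        self.obj = obj

    def __iter__(self):
        yield _to_json(self.obj)


def stamping_middleware(app):
    def wrapped(environ, start_response):
        environ[KEY] = RichResponse          # offer the wrapper downstream
        result = app(environ, start_response)
        try:
            if isinstance(result, RichResponse):
                data = dict(result.obj)      # rich path: no parsing needed
            else:
                # Something in between changed the body: parse as usual.
                data = json.loads(b"".join(result).decode("utf-8"))
        finally:
            close = getattr(result, "close", None)
            if close is not None:
                close()
        data["stamped"] = True
        return [_to_json(data)]
    return wrapped
```

An application that knows about the key returns environ[KEY](obj); one
that doesn't just returns serialized JSON, and both come out the same.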

Now, can you expand this into more of a tutorial, give more hints and so 
on?  Absolutely.  It'd be a great idea to.  But the basic idea is simple 
and doesn't require rigorous definitions -- it just needs people to publish 
what keys they're using and the *specifications thereof*.

What you're trying to specify is effectively a *meta*-specification: much 
more difficult to do well, and not nearly as useful to have in this case.


From jim at zope.com  Wed Mar  7 10:53:26 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 7 Mar 2007 04:53:26 -0500
Subject: [Web-SIG] daemon tools
In-Reply-To: <200703051225.10896.jtate@rpath.com>
References: <515038D2-29A5-498A-848E-8802C1963C91@zope.com>
	<200703051225.10896.jtate@rpath.com>
Message-ID: <001FA4CA-1923-481C-8363-8381B7B7D6CD@zope.com>


On Mar 5, 2007, at 12:25 PM, Joseph Tate wrote:

> On Saturday 03 March 2007 11:08:24 Jim Fulton wrote:
>>
>> Anyway, I share this for your consideration.  There are probably
>> better tools out there than zdaemon and supervisor2, but I'm not
>> aware of them. :)  I'm curious what other people have found or use.
>
> ll.daemon (http://www.livinglogic.de/Python/daemon/index.html)  
> seems to be a
> straightforward and very simple library for core daemon functionality.

Ah, this was the one mentioned in the open-space talk.  This looks  
very similar to a much earlier version of zdaemon.  A disadvantage I  
see with it is that it requires modifying a Python application to use  
it.  We moved away from that model with zdaemon, which can wrap any  
application.  We use it to make the spread daemon sane for example.  
Does ll.daemon provide a monitoring process that restarts an  
application process if it exits abnormally?

>
> Daemontools isn't very well respected by the SysV style initscript  
> crowd, and
> vice versa.  That's an external non python dependency, and not  
> commonly
> available.  Certainly not available on Windows.

Yes, I've heard similar things.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jim at zope.com  Wed Mar  7 11:04:47 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 7 Mar 2007 05:04:47 -0500
Subject: [Web-SIG] daemon tools
In-Reply-To: <200703051225.10896.jtate@rpath.com>
References: <515038D2-29A5-498A-848E-8802C1963C91@zope.com>
	<200703051225.10896.jtate@rpath.com>
Message-ID: <36991237-260A-40C2-BFB4-23B201417E61@zope.com>


On Mar 5, 2007, at 12:25 PM, Joseph Tate wrote:
...
> ll.daemon (http://www.livinglogic.de/Python/daemon/index.html)  
> seems to be a
> straightforward and very simple library for core daemon functionality.
...
> I have written my own daemon base class (Pretty restrictive license
> [reciprocal], but I'm sure I could get it loosened).
> http://hg.rpath.com/raa-1.1?f=9ac380d082f4;file=raa/service/ 
> daemon.py I'm not
> married to it though, so would be happy to spin it out and remove  
> the conary
> requirements, or just junk it.

Are either of these useful on Windows?  IOW, do they map to services  
on windows?

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jim at zope.com  Wed Mar  7 11:16:36 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 7 Mar 2007 05:16:36 -0500
Subject: [Web-SIG] daemon tools
In-Reply-To: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local>
References: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local>
Message-ID: <475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com>

On Mar 5, 2007, at 1:38 PM, Robert Brewer wrote:
...
> What several people have asked for is the ability to combine
> applications (and WSGI components) from a variety of frameworks into a
> single "website". What I'm proposing is that we standardize on a  
> set of
> topics/channels/events/signals that are "site-wide" events, like  
> start,
> stop, restart and graceful. If we collaborated on a tool to manage
> those, we could potentially make the codebases of each project  
> smaller,
> not just by removing the event manager, but by collaborating on a  
> set of
> standard event handlers, one of which could be a "daemonize me"  
> handler.

Agreed.

>
> What we have now:
>
>     CherryPy              Zope              Django
>     --------             ------             -------
>       ???                events             signals
>        |                    |                  |
>     autoreload             ???             autoreload
>        |                    |                  |
>     engine                zdrun               ???
>        |                    |                  |
>       ???                 zdctl               ???
>
> What we could have instead:
>
>                       webctl     modpython_gateway
>                          |           /
>          ------------ pywebd ------------
>         /                |               \
>     --------          ------           ------
>     CherryPy           Zope            Django
>
>
> ...where the "pywebd" module:
>
>  1. Composes the WSGI stack (provides a library to do so at least),
>  2. Notifies frameworks of site-wide events (like start, stop, restart
> and graceful),
>  3. Provides plugins that frameworks can "notify"; for example, adding
> files to an autoreload plugin.

This sounds great to me.
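
To make the idea concrete, here is a small sketch of the site-wide event
part.  No "pywebd" module exists; Bus, AutoreloadPlugin, and their
methods are hypothetical names, and only the four channel names come from
the message above.

```python
class Bus:
    """Tiny publish/subscribe hub for site-wide events."""

    CHANNELS = ("start", "stop", "restart", "graceful")

    def __init__(self):
        self._listeners = {channel: [] for channel in self.CHANNELS}

    def subscribe(self, channel, callback):
        if channel not in self._listeners:
            raise ValueError("unknown channel: %r" % (channel,))
        self._listeners[channel].append(callback)

    def publish(self, channel, *args):
        # Copy the list so callbacks may subscribe during a publish.
        return [callback(*args) for callback in list(self._listeners[channel])]


class AutoreloadPlugin:
    """Collects files that frameworks ask to have watched."""

    def __init__(self, bus):
        self.files = set()
        self.running = False
        bus.subscribe("start", self.start)
        bus.subscribe("stop", self.stop)

    def add_file(self, path):
        self.files.add(path)

    def start(self):
        self.running = True

    def stop(self):
        self.running = False
```

Each framework would subscribe its own start/stop handlers to the same
bus, and a control script would only ever need to call publish().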

>> I think your "sitewide container" is the main program that loads
>> the WSGI components.  This might be Apache, if mod_python is
>> used, or some Python script/program.
>
> Apache itself is not going to be the chunk of code that loads the WSGI
> components. In my head, a modpython_gateway module (or something
> similar) would ask pywebd to do that.

Right.

>> I was discussing a tool that managed the main program in the
>> latter case. Something that started and restarted it, provided
>> status information, helped it to run as a proper daemon and so on.
>
> Sure, something like zdctl? But zdctl doesn't do the actual fork,  
> zdrun
> does...so what does "help run as a proper daemon" mean?

(zdrun is really an internal implementation detail of zdaemon.  The  
latest version of zdaemon hides this much more than earlier versions. )

Logically, zdctl runs zdrun, which forks and execs the application  
process. (In the latest version, there is just one script, zdaemon,  
that loads either the zdctl or zdrun entry point when it is run.)  
zdrun does the daemonizing steps:

  - disconnecting from the controlling terminal, and

  - changing to a different user if requested

before forking and execing the application.
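
As a Unix-only sketch of those steps (this is not zdaemon's code, and the
function names are made up): detach from the terminal, optionally switch
user, then fork and exec the application and wait for it.  Supplementary
groups, umask, and redirecting stdin/stdout are left out.

```python
import os
import pwd


def detach_from_terminal():
    """Leave the controlling terminal (classic fork + setsid)."""
    if os.fork() > 0:
        os._exit(0)          # original process returns to the shell
    os.setsid()              # new session, no controlling terminal


def switch_user(username):
    """Drop privileges to username; only works when started as root."""
    entry = pwd.getpwnam(username)
    os.setgid(entry.pw_gid)  # group first, while still privileged
    os.setuid(entry.pw_uid)


def run_and_wait(argv):
    """Fork, exec argv in the child, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)    # only reached if exec failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


def main(argv, user=None):
    detach_from_terminal()
    if user is not None:
        switch_user(user)
    return run_and_wait(argv)
```

A monitoring loop that restarts the application after an abnormal exit
would wrap run_and_wait(); none of this maps onto Windows services.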

I see a division of responsibilities between:

- A facility for managing an application process

   - start/stop/status/etc

   - passing environment variables, providing some logging support if  
necessary (especially for applications that spew to standard err/out).

   - Optionally providing other daemon behaviors like disconnecting  
from the controlling terminal, changing user, etc.  zdaemon provides  
this service on behalf of applications.

- A main program that provides common application-level services like  
the ones you describe above.

   - Optionally providing other daemon behaviors like disconnecting  
from the controlling terminal, changing user, etc.  ll.daemon  
provides some of these services within an application.

A question is whether to provide the daemonizing support in the main  
program or in the controlling program.  Note that in answering this  
question, we probably need to have an idea how this will work on  
windows.  If Unix-specific daemonizing code is in the main  
application, then the application won't be portable. Of course, if  
the main program is generic, it might not be a big deal to have  
separate versions for Windows and Unix.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jim at zope.com  Wed Mar  7 11:34:15 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 7 Mar 2007 05:34:15 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
References: <45E99DC1.4010703@zetaweb.com> <45E8EB97.6090805@zetaweb.com>
	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
	<45E99DC1.4010703@zetaweb.com>
	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
Message-ID: <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>


On Mar 5, 2007, at 4:38 PM, Phillip J. Eby wrote:
...
> Personally, I don't care for the Paste Deploy syntax -- frankly  
> it's almost barbaric.  :)

I don't mean to pick on you, but I really *hate* comments like this.   
I don't like softer forms like "complicated" or even "makes me  
uneasy".  It would be far more helpful if you provided specific  
criticism.  I'd appreciate it if we would all just ignore statements  
like this and, preferably, stop making them.

>   But the concept of being able to specify stacks, routes, and  
> configuration in a plain text format that includes package  
> information for automated deployment is nonetheless an important one.

Yes

> A couple years back, I started writing a library to parse a more  
> sophisticated, Python-like syntax to do the same sorts of things,  
> but only got as far as the parser.

A few years back, we created a library to parse more sophisticated  
apache-like syntax and I wish we hadn't.  The ini/config format is  
pretty standard and, IMO, really quite adequate.  I'm convinced that  
we don't really need another configuration format, at least not at  
this level.

...

> Anyway, all that aside, I think it would be fantastic if we could  
> come up with some "universal file format" for single-file  
> configuration and deployment of applications (including auto- 
> install of all needed eggs),

Me too. That's one of the reasons I created zc.buildout.  But that's  
a big commitment.  With buildout, I can use a single configuration  
file and have recipes that generate lots of little configuration  
files as necessary, for lots of applications like databases, ldap  
servers, and web applications that will never use a single  
configuration file on their own.  I'd be happy if we could tackle a  
simple configuration format that handled the kinds of things Paste  
Deployment handles now and maybe a little more.  I'll get my cake and  
eat it too with buildout. :)


> that could get stdlib support and ultimately hosting company  
> support.  This would actually give us a leg up on even PHP for ease- 
> of-deployment.

Aside from the universal configuration file issue, I think this would  
be a terrific thing for us to focus on.  Something I hear a lot is  
how much easier PHP applications are to deploy to hosting providers.   
I would *love* it if Python had a similar story, even if only for  
smaller applications.

I'd love to get some input from people who know a lot about what makes  
deploying PHP apps so easy.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jim at zope.com  Wed Mar  7 11:37:35 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 7 Mar 2007 05:37:35 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <200703051723.09795.jtate@rpath.com>
References: <45E8EB97.6090805@zetaweb.com> <200703051254.57032.jtate@rpath.com>
	<45EC8952.1040703@colorstudy.com>
	<200703051723.09795.jtate@rpath.com>
Message-ID: <153051AE-13FF-4D1F-860D-36F94A97A77D@zope.com>


On Mar 5, 2007, at 5:23 PM, Joseph Tate wrote:

> On Monday 05 March 2007 16:19:14 Ian Bicking wrote:
>> Joseph Tate wrote:
>>> I find that multiple files gives you a nice way to override  
>>> defaults.  As
>>> long as the files are read in a way that's predictable and  
>>> documentable,
>>> and ultimately appear as if read from a single file (and possible
>>> displayable via some diagnostics link in an application).
>>
>> Allowing this sort of thing means that the application carries  
>> around a
>> complete config object of some sort, which I rather dislike -- it  
>> allows
>> for smart applications, but it makes it much harder to understand the
>> configuration and any possible side effects.  If we resolve the
>> configuration down to something more limited (as the Paste Deploy  
>> entry
>> points do) you can't really reconstruct the config from there.
>> *Something* could still reconstruct the config (an alternate config
>> loader, via logs, via debug settings, etc), just not the application
>> itself.
>>
>> This is somewhat problematic for applications that have particularly
>> complex config requirements, or want to support self- 
>> configuration.  The
>> best solution that I can think of with Paste Deploy in that case  
>> is to
>> just use the Paste Deploy configuration to point to the "real"
>> configuration.
>
> I agree.  That's why my app has a /config link that spits out the  
> "effective"
> configuration.  The overridden config is a hard requirement, I'd  
> love to hear
> alternative solutions.  /etc/php.d, /etc/httpd/conf.d and that ilk  
> come to
> mind as examples of this kind of thing.

FWIW, zc.buildout has a configuration model designed to support  
overriding.  Often there is a base configuration that is overridden  
by specific configurations for development and deployment.  It  
leverages the beautifully simple model of a dictionary of  
dictionaries provided by ConfigParser.
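
As a rough illustration of the dictionary-of-dictionaries model with
overriding (a sketch only, not zc.buildout's actual code; the section
names, option names, and the merge() helper are all made up for the
example):

```python
import configparser

# Hypothetical base configuration shared by every deployment.
BASE = """
[app]
debug = false
log_dir = /var/log/myapp

[server]
port = 8080
"""

# Hypothetical development configuration that overrides one option.
DEVELOPMENT = """
[app]
debug = true
"""


def load_sections(text):
    """Parse ini text into a plain dict of dicts (section -> option -> value)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: dict(parser.items(name)) for name in parser.sections()}


def merge(base, override):
    """Return a new dict of dicts in which override's options win per section."""
    merged = {name: dict(options) for name, options in base.items()}
    for name, options in override.items():
        merged.setdefault(name, {}).update(options)
    return merged


effective = merge(load_sections(BASE), load_sections(DEVELOPMENT))
```

Here effective["app"]["debug"] comes from the development file while
everything else comes from the base.  ConfigParser.read() also accepts
a list of filenames and lets later files override earlier ones, which
may be enough for simple cases.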

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jim at zope.com  Wed Mar  7 12:01:12 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 7 Mar 2007 06:01:12 -0500
Subject: [Web-SIG] [Proposal] "website" and first-level conf
In-Reply-To: <45EDC772.3090803@simplistix.co.uk>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
	<45EDC772.3090803@simplistix.co.uk>
Message-ID: <321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com>


On Mar 6, 2007, at 2:56 PM, Chris Withers wrote:

> Jim Fulton wrote:
>> On Mar 3, 2007, at 11:27 PM, Chad Whitacre wrote:
>> ...
>>> Now, Jim: it looks like Zope still uses a Unix-y userland for
>>> INSTANCE_HOME, yes?
>> Yes, but I hate it.  At Zope Corporation, we're moving away from  
>> it  for a number of reasons.
>
> I actually like it a lot, still, and I haven't heard compelling  
> arguments, for me, for other things...
>
> The big plus point for me is that everything needed for one  
> deployment is in one folder.

Having everything in one folder is great for development.  It isn't  
so good for deployment, at least not on Unix.  (I can think of lots  
of reasons why it wouldn't be great on Windows either.)  For example,  
site administrators like to keep log files together and separate from  
other files.

Even if things are all together, there's really no point in having  
separate subdirectories, typically containing only one or 2 files,  
within the instance.  In a development instance, I'd much rather have  
a single directory containing the few needed files directly.  The  
only exception to this for me would be to have a subdirectory for  
Python modules, if you have instance specific Python modules.  Having  
to look in subdirectories for configuration and log files is just a  
pain.

...

>> For development, it adds structure that isn't needed.  A Zope   
>> instance really only needs a few files.  Trying to mimic some   
>> notional unix layout just adds pointless structure.
>
> It's kindof self documenting though:
>
> /etc -> config
> /bin -> scripts
> /var -> data
> /log -> logs
>
> I like that consistency, regardless of its origins...

But without these, you have something like:

   zope.conf
   zopectl
   runzope
   debugzope
   scriptzope
   Data.fs
   zope.log

It is pretty clear that zope.conf is a configuration file, zope.log  
is a log file, and Data.fs is a data file.  On Unix, it's pretty clear  
that the others are scripts, because they're executable and, on  
Windows, they should have .bat or .exe suffixes.

>> The traditional complex Zope instance file layout led to the use  
>> of  an instance "skeleton" to deal with all of the files, which  
>> led, in  turn, to a copy and hack style of configuration  
>> customization that is  inflexible and encourages cruft.
>
> I think the Zope 3 skeletons went the wrong way. The skeletons  
> work, but where they only contain config that's specific to that  
> instance. Zope 3's notions of putting python scripts (and non- 
> trivial ones at that!) and the like into the instance home made me  
> shudder...

I'm not sure if you are referring to more than scripts.  I agree that  
we shouldn't have put utility scripts in instances.  I would argue  
that only the ctl script should go in instances.  The runzope,  
scriptzope, and debugzope scripts could be completely generic and  
invoked by an instance specific ctl script.  This is what I do in my  
latest Zope 3 buildout recipes.

Otherwise, Zope 2 and Zope 3 skeletons look pretty similar to me.

>> For production deployments, we (Zope Corporation) install files  
>> into  the *real* Unix tree where site administrators want them.
>
> Not everyone runs on unix. Having a standard layout that fits into  
> a folder works cross platform to a large extent.

Only for a particular definition of "works".  No experienced Unix  
administrator would say it works on Unix. I suspect that a  
professional Windows server administrator would have similar concerns.

...

My original point was not to advocate a particular layout but to  
point out that different layouts will be needed in different  
situations and that mandating a particular layout was likely to cause  
problems.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From ubernostrum at gmail.com  Wed Mar  7 13:08:13 2007
From: ubernostrum at gmail.com (James Bennett)
Date: Wed, 7 Mar 2007 06:08:13 -0600
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
References: <45E8EB97.6090805@zetaweb.com>
	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
	<45E99DC1.4010703@zetaweb.com>
	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
Message-ID: <21787a9f0703070408l70baf159r72a4865e19d75cc7@mail.gmail.com>

On 3/7/07, Jim Fulton <jim at zope.com> wrote:
> Aside from the universal configuration file issue, I think this would
> be a terrific thing for us to focus on.  Something I hear a lot is
> how much easier PHP applications are to deploy to hosting providers.
> I would *love* it if Python had a similar story, even if only for
> smaller applications.
>
> I'd love to get some input from people who know a lot about what makes
> deploying PHP apps so easy.

I've mostly been lurking because everybody here's quite a bit smarter
than I am on most of the issues discussed, but in a past life I had a
fair amount of experience working with and deploying PHP, so I'll
throw in my $0.02.

PHP is (or was, when I was doing it) "easy to deploy" largely because
of two things:

1. mod_php.
2. Baked-in database libraries.

Everybody already knows that web-server setup is a wart for Python
(and the discussion on that lately has been encouraging), so I won't
dwell on it except to say that I live for the day I'll be able to drop
my Apache -> mod_proxy -> lighttpd -> Unix socket -> FastCGI -> WSGI
-> Django setup (this on a "Python-friendly" shared host, no less) and
have a server configuration that's simpler than the blog app it runs.

The database issue is one that seems to get overlooked a bit, but is
also a killer. PHP gives you SQLite and MySQL support for free, and
Postgres is trivially easy to add if a host is offering Postgres
databases. Meanwhile, most hosts are still with Python 2.3 or 2.4, so
you don't even get SQLite out-of-the-box. The better ones will have
appropriate DB modules installed anyway, but that still seems to be
something of a crap shoot, and somebody who has to build their own
copy of mysqldb to use Python on their hosting account is somebody
who's not going to use Python on their hosting account.

I'm hoping that the ongoing framework hype will help a lot with the
database issue, though; a number of hosting companies right now seem
to be waking up and realizing that there's a lot of money to be made
from framework converts who need solid support for languages that
aren't PHP.

I'd say that if/when these two issues are overcome, or even made
slightly less nasty to deal with, there's not really anything else PHP
can compete on; WSGI and the ever-expanding range of kick-ass web
tools Python offers blow PHP out of the water. To take an easy
example, cruft-free URLs are still anywhere from tedious to nasty under
PHP; you have to fiddle with mod_rewrite, and every PHP project has
its own monolithic URL dispatch system. On the Python side, WSGI and
tools like Paste Deploy make it trivially easy to hang any app anywhere you
want it in your URL scheme.
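
To make the "hang any app anywhere" point concrete, here is a minimal
sketch of prefix-based mounting in plain WSGI.  It is not Paste
Deploy's own implementation; mount() and make_app() are hypothetical
helpers written for this example.

```python
def make_app(body):
    """Build a trivial WSGI app that always answers with body (illustration only)."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [body.encode("utf-8")]
    return app


def mount(mapping, default):
    """Dispatch to the app whose URL prefix matches, longest prefix first."""
    prefixes = sorted(mapping, key=len, reverse=True)

    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix in prefixes:
            if path == prefix or path.startswith(prefix + "/"):
                # Move the matched prefix from PATH_INFO to SCRIPT_NAME so the
                # mounted app sees URLs relative to its own mount point.
                environ = dict(environ)
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                environ["PATH_INFO"] = path[len(prefix):]
                return mapping[prefix](environ, start_response)
        return default(environ, start_response)

    return dispatcher


# Hypothetical site layout: two apps mounted under their own prefixes.
site = mount({"/blog": make_app("blog"), "/wiki": make_app("wiki")},
             default=make_app("home"))
```

Each mounted app only ever sees its own PATH_INFO, so it does not need
to know where in the URL scheme it was hung.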


And setting aside actual technical issues, I also think there's room
to work with documentation; going back to Jim's comment at the PyCon
frameworks panel about documentation that tells stories, it's worth
pointing out that a lot of the "PHP is easier" perception is largely
just that -- a perception -- and that various languages and tools, PHP
included, have compensated for some pretty nasty warts by telling
compelling stories (Rails certainly wouldn't be where it is today if
not for some great storytelling on the part of the people marketing
it). I'm sure we have plenty of good stories we could tell, and I'm
pretty sure we don't have as many warts :)


-- 
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."

From zbynek.winkler at gmail.com  Wed Mar  7 13:50:16 2007
From: zbynek.winkler at gmail.com (Zbynek Winkler)
Date: Wed, 7 Mar 2007 13:50:16 +0100
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <21787a9f0703070408l70baf159r72a4865e19d75cc7@mail.gmail.com>
References: <45E8EB97.6090805@zetaweb.com>
	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
	<45E99DC1.4010703@zetaweb.com>
	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
	<21787a9f0703070408l70baf159r72a4865e19d75cc7@mail.gmail.com>
Message-ID: <e120e9900703070450n41543234p2e46a8ab8c8a90a@mail.gmail.com>

On 3/7/07, James Bennett <ubernostrum at gmail.com> wrote:
> On 3/7/07, Jim Fulton <jim at zope.com> wrote:
> > Aside from the universal configuration file issue, I think this would
> > be a terrific thing for us to focus on.  Something I hear a lot is
> > how much easier PHP applications are to deploy to hosting providers.
> > I would *love* it if Python had a similar story, even if only for
> > smaller applications.
> >
> > I'd love to get some input from people who know a lot about what makes
> > deploying PHP apps so easy.
>
> I've mostly been lurking because everybody here's quite a bit smarter
> than I am on most of the issues discussed, but in a past life I had a
> fair amount of experience working with and deploying PHP, so I'll
> throw in my $0.02.
>
> PHP is (or was, when I was doing it) "easy to deploy" largely because
> of two things:
>
> 1. mod_php.
> 2. Baked-in database libraries.

And the fact of a really simple 'hello world' that just works. Python
is dead simple for cmdline apps (print "hello world") but not for
webapps. And the fact that deploying a Python app often consists of
configuring the whole "everything" (if not building from source, or
even finding on the web what exactly one needs for the particular
situation) does not really help either.

> Everybody already knows that web-server setup is a wart for Python
> (and the discussion on that lately has been encouraging), so I won't
> dwell on it except to say that I live for the day I'll be able to drop
> my Apache -> mod_proxy -> lighttpd -> Unix socket -> FastCGI -> WSGI
> -> Django setup (this on a "Python-friendly" shared host, no less) and
> have a server configuration that's simpler than the blog app it runs.

That is exactly what I meant :(

Zbynek Winkler

-- 
http://robotika.cz/

From sidnei at enfoldsystems.com  Wed Mar  7 14:42:00 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Wed, 7 Mar 2007 10:42:00 -0300
Subject: [Web-SIG] daemon tools
In-Reply-To: <475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com>
References: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local>
	<475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com>
Message-ID: <a7a2b76b0703070542o617e2c68l9b9000eac6e3064c@mail.gmail.com>

On Windows, the NT Service Controller does all the dirty work. And it's
pretty easy to write a service in Python that can run any application.
The simplest Python service is shorter than 30 lines I think.

Dealing with a service on Windows usually involves:

  - Registering/Unregistering the service
  - Setting service options
    - Startup type (automatic/manual/disabled)
    - Username (can be local machine or Active Directory,
      if the machine is on a domain)
    - Dependencies (a service can depend on other services)
    - Failure mode
      - There are 3 tries by default, you can customize
        what happens on each try
        - Ignore
        - Restart the Service
        - Run a program
        - Restart the computer

The service, after being registered can be managed with standard tools
present on the system:

C:\src>net stop bthserv
O serviço de Bluetooth Support Service está sendo finalizado.
O serviço de Bluetooth Support Service foi finalizado com êxito.


C:\src>net start bthserv
O serviço de Bluetooth Support Service está sendo iniciado.
O serviço de Bluetooth Support Service foi iniciado com êxito.

You can also use command-line tools to query the service status:

C:\src>sc \\pena queryex bthserv

SERVICE_NAME: bthserv
        TYPE               : 20  WIN32_SHARE_PROCESS
        STATE              : 4  RUNNING
                                (STOPPABLE,NOT_PAUSABLE,ACCEPTS_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0
        PID                : 1372
        FLAGS              :

C:\src>sc \\pena queryex xmlprov

SERVICE_NAME: xmlprov
        TYPE               : 20  WIN32_SHARE_PROCESS
        STATE              : 1  STOPPED
                                (NOT_STOPPABLE,NOT_PAUSABLE,IGNORES_SHUTDOWN)
        WIN32_EXIT_CODE    : 1077       (0x435)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0
        PID                : 0
        FLAGS              :

And that's just the tip of the iceberg. You can manage services on
other machines for example, still from the command line. You can query
service status with WMI, and you can interact with services from .NET.

I would say that, thus, a service manager like 'zdaemon' is not
actually that useful on Windows *unless* it implements a Windows
Service. In fact, I could see it being used as both a 'standalone
service manager' and as a simple service with the NT Service
Controller with little overlap, though I would highly discourage the
former.

Some parts of zdaemon would be useful, though, and do not work on
Windows today because of some over-unixism in zdaemon: the interactive
prompt and script runner ('zopectl debug' and 'zopectl run'), for
example. I'm sure those two don't need to know about 'fork' or signals.

What I'm really interested in is how the service would communicate
with the program being controlled. This is the painful part, and where
I think we need to work together to make sure it works on Windows and
on *nix platforms. You can surely count on me to discuss that part.

As I mentioned on another thread, Zope uses 'signals' on *nix, and
'named events' on Windows, by means of the 'Signals' package in Zope.
We could possibly re-use that.

-- 
Sidnei da Silva
Enfold Systems                http://enfoldsystems.com
Fax +1 832 201 8856     Office +1 713 942 2377 Ext 214

From jtate at rpath.com  Wed Mar  7 14:46:33 2007
From: jtate at rpath.com (Joseph Tate)
Date: Wed, 7 Mar 2007 08:46:33 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
References: <45E99DC1.4010703@zetaweb.com>
	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
Message-ID: <200703070846.33887.jtate@rpath.com>

On Wednesday 07 March 2007 05:34:15 Jim Fulton wrote:
> I'd love to get some input from people who know a lot about what makes
> deploying PHP apps so easy.

It's not the packaging format.  Most php apps come down as a tarball.  Extract 
it to your apache root, and you can connect to the app and do configuration, 
without even restarting apache (thanks to mod_php).   I think the key thing 
is that configuring a "well written" php app is done through the web 
interface.  No mucking with config files, no apache configuration required, 
etc.  Just have to create a database and a user with permissions to it.  If 
you'd like a specific example, I suggest trying to install gallery 
(http://gallery.menalto.com).

There are sacrifices to make for this approach though: the app has to be able 
to write at least to its own config file, and to .htaccess.  This means that 
security has to be super tight.  Frequently the instructions are to chmod 777 
the app's top level directory, configure, and then unchmod.  Because so many 
things can be modified via .htaccess, including directory specific php 
settings, you rarely need further configuration.

-- 
Joseph Tate
Software Engineer
rPath Inc.
http://www.rpath.com/rbuilder/
(919) 851-3984 x2106

From rodsenra at gpr.com.br  Wed Mar  7 15:34:14 2007
From: rodsenra at gpr.com.br (Rodrigo Senra)
Date: Wed, 7 Mar 2007 11:34:14 -0300
Subject: [Web-SIG] daemon tools
In-Reply-To: <a7a2b76b0703070542o617e2c68l9b9000eac6e3064c@mail.gmail.com>
References: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local>
	<475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com>
	<a7a2b76b0703070542o617e2c68l9b9000eac6e3064c@mail.gmail.com>
Message-ID: <20070307113414.5ee7384a@Fenix>


[ Sidnei da Silva ]:
|The service, after being registered can be managed with standard tools
|present on the system:
|
|C:\src>net stop bthserv
# cut
|C:\src>net start bthserv
# cut
|C:\src>sc \\pena queryex bthserv
# cut
|C:\src>sc \\pena queryex xmlprov
# cut

And, as I am sure you are aware, the service can also be managed
by Python through win32all:

<code>
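    # random samples from a python service watchdog ;o)
    import win32service  # part of pywin32 (win32all)

    service = "bthserv"  # assumed: name of the service being watched
    hscm = win32service.OpenSCManager(None,
                                      None,
                                      win32service.SC_MANAGER_ALL_ACCESS)

    hsvc = win32service.OpenService(hscm,
                                    service,
                                    win32service.SERVICE_ALL_ACCESS)
    # QueryServiceStatus returns a tuple; index 1 is the current state
    status = win32service.QueryServiceStatus(hsvc)
    # code to test status and decide to restart it (or not) omitted
    win32service.StartService(hsvc, None)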
</code>

|I would say that, thus, a service manager like 'zdaemon' is not
|actually that useful on Windows *unless* it implements a Windows
|Service.

For symmetry's sake in Windows a Python service manager could simply
use SCManager API under the hood (through win32all) to get the job done,
still keeping a consistent cross-platform modus operandi.

| In fact, I could see it being used as both a 'standalone
|service manager'

Do you mean a wrapper for native SCManager services ?


|There's some stuff from zdaemon that would be useful though, and do
|not work on Windows today due to some over-unixism in zdaemon, like an
|interactive prompt and script runner as 'zopectl debug' and 'zopectl
|run', I'm sure those two don't need to know about 'fork' or signals.
|
|What I'm really interested in is in how the service would communicate
|with the program being controlled. This is the painful part, and where
|I think we need to work together to make sure it works on Windows and
|on *nix platforms. You can surely count on me to discuss that part.

One naive suggestion would be to wrap Unix signals and Windows Event
Objects under a single signaling abstraction. If what you meant by
"communicate" can be restricted to flag-waving (and *not* some general
data structure IPC), then these mechanisms should suffice.
At least, I can say that Windows (manual reset) Event Objects are simple,
robust (even in multi-threaded scenarios), and reasonably cross-platform
from within the Windows family, IMHO.
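
A minimal sketch of such a single signaling abstraction follows.  It
is not Zope's 'Signals' package: the StopFlag class and the event name
are hypothetical, and the Windows branch assumes pywin32 (win32all) is
installed and has not been tested here.

```python
import os
import signal
import threading


class StopFlag:
    """A one-bit "please stop" message that another process can raise.

    Hypothetical interface for illustration only.
    """

    def __init__(self, event_name="myapp-stop"):
        self._flag = threading.Event()
        self._event_name = event_name  # only used on Windows

    def install(self):
        """Hook the flag up to the platform's native mechanism."""
        if os.name == "nt":
            self._install_windows()
        else:
            # POSIX: a controller raises the flag with kill(pid, SIGTERM).
            signal.signal(signal.SIGTERM,
                          lambda signum, frame: self._flag.set())

    def _install_windows(self):
        # Assumption: pywin32 is available.  A manual-reset named event
        # stands in for the Unix signal.
        import win32event
        handle = win32event.CreateEvent(None, True, False, self._event_name)

        def watch():
            # Block until a controller sets the named event, then raise the flag.
            win32event.WaitForSingleObject(handle, win32event.INFINITE)
            self._flag.set()

        threading.Thread(target=watch, daemon=True).start()

    def is_set(self):
        return self._flag.is_set()

    def wait(self, timeout=None):
        """Block until the flag is raised; return False on timeout."""
        return self._flag.wait(timeout)
```

The program being controlled only calls install() and then polls
is_set() or blocks in wait(); it never needs to know which mechanism
carried the message.  This covers flag-waving only, not general IPC.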

|As I mentioned on another thread, Zope uses 'signals' on *nix, and
|'named events' on Windows, by means of the 'Signals' package in Zope.
|We could possibly re-use that.

Great, just checked that out. I think that is the way to go.

Cheers,
Senra

-------------
Rodrigo Senra
GPr Sistemas 
http://www.gpr.com.br

From sidnei at enfoldsystems.com  Wed Mar  7 16:44:17 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Wed, 7 Mar 2007 12:44:17 -0300
Subject: [Web-SIG] daemon tools
In-Reply-To: <20070307113414.5ee7384a@Fenix>
References: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local>
	<475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com>
	<a7a2b76b0703070542o617e2c68l9b9000eac6e3064c@mail.gmail.com>
	<20070307113414.5ee7384a@Fenix>
Message-ID: <a7a2b76b0703070744r66316c1bge2cfc70f4eb13b3c@mail.gmail.com>

On 3/7/07, Rodrigo Senra <rodsenra at gpr.com.br> wrote:
> And, I am sure you are aware of that, the service can also be managed
> by Python through win32all:
>
# snip

Yeah, sorry. I thought that was pretty obvious, but I realize it wasn't *wink*.

> For symmetry's sake in Windows a Python service manager could simply
> use SCManager API under the hood (through win32all) to get the job done,
> still keeping a consistent cross-platform modus operandi.

Your suggestion is indeed quite appealing. I feel sad for not having
thought of that before. zdaemon could be just a wrapper for SCManager
and that is certainly the way to go.

> |What I'm really interested in is in how the service would communicate
> |with the program being controlled. This is the painful part, and where
> |I think we need to work together to make sure it works on Windows and
> |on *nix platforms. You can surely count on me to discuss that part.
>
> One naive suggestion would be to wrap Unix signals and Windows Event
> Objects under a single signaling abstraction. If what you meant by
> "communicate" can be restricted to flag-waving (and *not* some general
> data structure IPC), then these mechanisms should suffice.

Yes, in the case of Zope that's mainly abstracting SIGINT, SIGHUP, etc.

> |As I mentioned on another thread, Zope uses 'signals' on *nix, and
> |'named events' on Windows, by means of the 'Signals' package in Zope.
> |We could possibly re-use that.
>
> Great, just checked that out. I think that is the way to go.

I hope that others can agree too.

-- 
Sidnei da Silva
Enfold Systems                http://enfoldsystems.com
Fax +1 832 201 8856     Office +1 713 942 2377 Ext 214

From fumanchu at amor.org  Wed Mar  7 19:47:10 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Wed, 7 Mar 2007 10:47:10 -0800
Subject: [Web-SIG] daemon tools
In-Reply-To: <475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com>
Message-ID: <435DF58A933BA74397B42CDEB8145A8609E8D2C0@ex9.hostedexchange.local>

Jim Fulton wrote:
> On Mar 5, 2007, at 1:38 PM, Robert Brewer wrote:
> > ...where the "pywebd" module:
> >
> >  1. Composes the WSGI stack (provides a library to do so at least),
> >  2. Notifies frameworks of site-wide events (like start, 
> stop, restart
> > and graceful),
> >  3. Provides plugins that frameworks can "notify"; for 
> example, adding
> > files to an autoreload plugin.
> 
> This sounds great to me.

I wasn't expecting such quick agreement. ;)

For anyone's information, I've started developing just such a beast in
the CherryPy trunk:
http://www.cherrypy.org/browser/trunk/cherrypy/pywebd

CherryPy will probably continue to distribute it as a subpackage just
for ease of install, but it won't have any CP dependencies. If others
are really interested in developing this collaboratively, I'd be happy
to make it its own project and solicit committers. In particular,
there's no "webctl" module yet (because we need more discussion on its
role before I commit to a direction).
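
For readers who have not looked at the code, the notification idea in
points 2 and 3 of the quoted list could be sketched roughly as below.
This is an illustration only and not the pywebd code; the Bus and
Autoreloader names and the channel names are made up for the example.

```python
from collections import defaultdict


class Bus:
    """Tiny publish/subscribe hub for site-wide events (hypothetical API)."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, channel, callback):
        self._listeners[channel].append(callback)

    def publish(self, channel, *args):
        """Call every listener for channel; return their results in order."""
        return [callback(*args) for callback in self._listeners[channel]]


class Autoreloader:
    """Plugin that frameworks notify about files worth watching."""

    def __init__(self, bus):
        self.files = set()
        bus.subscribe("watch_file", self.files.add)


bus = Bus()
reloader = Autoreloader(bus)

# A framework registers interest in site-wide events...
log = []
bus.subscribe("start", lambda: log.append("framework started"))
bus.subscribe("stop", lambda: log.append("framework stopped"))

# ...and notifies plugins about things they care about.
bus.publish("watch_file", "/srv/app/settings.py")
bus.publish("start")
bus.publish("stop")
```

Frameworks and plugins only share channel names, so neither side has
to import the other.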

> I see a division of responsibilities between:
> 
> * A facility for managing an application process
> 
>    - start/stop/status/etc
> 
>    - passing environment variables, providing some logging 
>      support if necessary (especially for applications that
>      spew to standard err/out).
> 
>    - Optionally providing other daemon behaviors like
>      disconnecting from the controlling terminal, changing
>      user, etc.  zdaemon provides this service on behalf of
>      applications.
> 
> * A main program that provides common application-level 
> services like the ones you describe above.
> 
>    - Optionally providing other daemon behaviors like
>      disconnecting from the controlling terminal, changing
>      user, etc.  ll.daemon provides some of these services
>      within an application.
> 
> A question is whether to provide the daemonizing support in the main  
> program or in the controlling program.

The "main program" should have the daemonization support. This would
allow framework authors to continue providing "quickstart" and stop
calls to their users as a full-featured alternative to invoking the
controlling program (where "full-featured" includes daemonization,
etcetera). IMO the controlling program ("webctl") wouldn't do any of
your "optional daemon behaviors"; instead, it would be a command-line
way to specify/collect an environment (including config files), start
the main program, and then asynchronously send messages to the main
program like "stop" and "status". It would run, execute a command, and
then exit (much like apachectl does).
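
A very rough sketch of the "run, execute a command, exit" shape,
assuming a PID file and POSIX signals as the message channel (an
assumption made for this example; nothing above commits webctl to PID
files, and this approach does not carry over to Windows as written).
The function names are hypothetical.

```python
import os
import signal


def read_pid(pidfile):
    """Return the PID stored in pidfile, or None if it is missing or malformed."""
    try:
        with open(pidfile) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def status(pidfile):
    """Report 'running' or 'stopped' by probing the recorded PID (POSIX only)."""
    pid = read_pid(pidfile)
    if pid is None:
        return "stopped"
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing; it only checks existence
    except ProcessLookupError:
        return "stopped"
    except PermissionError:
        return "running"  # the process exists but belongs to another user
    return "running"


def stop(pidfile):
    """Ask the main program to shut down; return True if a signal was sent."""
    pid = read_pid(pidfile)
    if pid is None:
        return False
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return False
    return True
```

A command-line front end would map "webctl status" and "webctl stop"
onto these functions and exit immediately afterwards.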

This is also pretty much how I see zdctl operating, with a few areas I'd
like to investigate:

 1. I would very much like webctl to be the component that understands a
WSGI-composition config format or formats. Or rather, I don't want
pywebd to fuss with that--pywebd should understand the entry points and
use/expose an API for composing a WSGI stack, but that should be an
imperative API, so that frameworks can do their own composition for the
user. For example, TG silently adds URL handlers for Mochikit (that
shouldn't have to be included in a config file by the user).
 2. AF_UNIX isn't available on Windows. I'd like to find ways of passing
status back from pywebd to webctl that don't involve a socket.
 3. zdctl spawns zdrun (right?). I'd like webctl to spawn pywebd, but
currently I'm calling the whole package "pywebd". I probably need to
change:

/pywebd
    __init__.py
    base.py
    plugins.py
    win32.py

...to a more separated arrangement:

/pyweb (other name ideas most welcome)
    __init__.py
    base.py
    plugins.py
    pywebd(.exe)
    unix.py
    webctl(.exe)
    win32.py
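
As for the imperative composition API mentioned in point 1, a minimal
sketch might look like the following.  compose() and add_header() are
hypothetical names for this example, not an existing pywebd or
CherryPy API.

```python
def compose(app, *middleware_factories):
    """Wrap app so that the first factory listed ends up outermost."""
    for factory in reversed(middleware_factories):
        app = factory(app)
    return app


def add_header(name, value):
    """Return a middleware factory that appends one response header."""
    def factory(app):
        def middleware(environ, start_response):
            def patched_start(status, headers, exc_info=None):
                return start_response(status, headers + [(name, value)], exc_info)
            return app(environ, patched_start)
        return middleware
    return factory


def hello(environ, start_response):
    """Innermost application for the example."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


# A framework could build this stack silently on the user's behalf.
stack = compose(hello, add_header("X-Outer", "1"), add_header("X-Inner", "2"))
```

Because the stack is built by ordinary function calls, a framework can
add its own layers without the user listing them in a config file; a
config-driven webctl could call the same function.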

> Note that in answering this question, we probably need to have an
> idea how this will work on windows.  If Unix-specific daemonizing
> code is in the main application, then the application won't be
> portable. Of course, if the main program is generic, it might not
> be a big deal to have separate versions for Windows and Unix.

My hope is that pywebd will have a "win32" module (as my initial foray
does). Perhaps I should move the daemonization plugin to a "unix"
(posix?) module.


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From ianb at colorstudy.com  Wed Mar  7 22:49:38 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed, 07 Mar 2007 15:49:38 -0600
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
References: <45E99DC1.4010703@zetaweb.com>
	<45E8EB97.6090805@zetaweb.com>	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>	<45E99DC1.4010703@zetaweb.com>	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
Message-ID: <45EF3372.4020007@colorstudy.com>

Jim Fulton wrote:
>> A couple years back, I started writing a library to parse a more  
>> sophisticated, Python-like syntax to do the same sorts of things,  
>> but only got as far as the parser.
> 
> A few years back, we created a library to parse more sophisticated  
> apache-like syntax and I wish we hadn't.  The ini/config format is  
> pretty standard and, IMO, really quite adequate.  I'm convinced that  
> we don't really need another configuration format, at least not at  
> this level.

Details of the structure aside, I've found string:string dictionaries 
entirely sufficient for expressing every configuration I've wanted to 
do.  I'm very happy that Paste Deploy doesn't support Python syntax for 
anything.
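
To show what string:string dictionaries look like from the
application's side, here is a sketch of a factory in the style of the
Paste Deploy app_factory entry point, which receives a global
dictionary plus local keyword settings.  The "debug" and "greeting"
keys and the as_bool() helper are made up for the example.

```python
def as_bool(value):
    """Interpret a config string as a boolean (helper defined for this sketch)."""
    return value.strip().lower() in ("1", "true", "yes", "on")


def app_factory(global_conf, **local_conf):
    """Build a WSGI app from two string:string dictionaries."""
    # Local settings win; global ones act as shared defaults.
    settings = dict(global_conf)
    settings.update(local_conf)
    debug = as_bool(settings.get("debug", "false"))
    greeting = settings.get("greeting", "hello")

    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        body = greeting + (" (debug)" if debug else "")
        return [body.encode("utf-8")]

    return app


# Every value arrives as a string; the factory does any conversion itself.
app = app_factory({"debug": "true"}, greeting="hi")
```

The application converts the few values it cares about and keeps no
reference to the configuration loader.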

>> that could get stdlib support and ultimately hosting company  
>> support.  This would actually give us a leg up on even PHP for ease- 
>> of-deployment.
> 
> Aside from the universal configuration file issue, I think this would  
> be a terrific thing for us to focus on.  Something I hear a lot is  
> how much easier PHP applications are to deploy to hosting providers.   
> I would *love* it if Python had a similar story, even if only for  
> smaller applications.
> 
> I'd love to get some input from people who know a lot about what makes  
> deploying PHP apps so easy.

Well, it's a big help that PHP doesn't have Python's import system.  Oh 
how I hate Python imports... anyway, since it just uses the filesystem 
everything is kind of naturally hierarchical and isolated.  There are 
some system-wide configurations (in php.ini) -- these cause deployers a 
lot of pain.  But they are mostly overridable with .htaccess, I think. 
Also there aren't many libraries, and what libraries there are are 
typically shipped with the applications.  PEAR (the PHP library system) 
started after I stopped doing much of any PHP, so I don't know how it 
affects things.

PHP also gets a lot of benefit from a CGI-like execution model.  There's 
a ton of crap that gets swept under the rug by this -- lots of memory 
leaks, for instance.  As they've been building up larger frameworks 
built from PHP code, the CGI-like execution speed has also been hitting 
them.  But since they have a fairly large library written in C (that is 
persistent and shared) it's usually pretty reasonable; it's just when 
they tried to copy Rails that it started really biting them.

I think the database drivers are a bit of a red herring.  What 
extensions PHP has been compiled with is pretty fixed by the hosting 
provider -- they just happen to all provide database drivers for the 
databases they support.  Which is kind of a no-brainer; if they *cared* 
about Python they'd easily be able to do the same for Python.  It would 
help if Python shipped with one or two, but eh.

Anyway, my feelings are that it's: (a) simple hierarchy through the 
filesystem (which will make Chad all excited ;), (b) reliability of the 
CGI model, and (c) hosting providers give a damn.  We can't do much 
about (c).  (a) requires an isolation tool, but we have a few now.  (b) 
still needs doing.

That Python is theoretically faster than PHP due to its typical 
execution model doesn't mean much to hosting providers.  They tend to be 
memory-constrained more than CPU constrained anyway.  And if you have 
slow code, you personally suffer -- but if you use lots of memory, you 
make everyone suffer.  One thing many hosts do is just periodically kill 
users' processes if they hang around too long.  Most don't seem to care 
if you have long-running processes, though I've heard a few might 
disable your account.

Someone (but I've forgotten who) suggested a technique to assist with 
this.  The SCGI package has a script cgi2scgi, just a simple CGI script 
written in C that sends the request to another server; I think just a 
port, but I'm sure it could be extended easily enough to send it to a 
named socket.  Anyway, if there was just a bit of process management 
code in that script it could also serve as a launcher, doing on-demand 
launching of a server (Flup I suppose) and then passing it on to that 
script.  FastCGI does all these things, but setup can be fairly 
complicated and many implementations are buggy.  Anyway, extending 
cgi2scgi to do this, along with some isolated environment, should be a 
fairly simple way to make Python hosting on commodity hosts a lot easier.
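
The process-management part of that idea, sketched in Python rather
than C for readability.  This is an illustration under assumptions:
it uses a TCP port, the function names are hypothetical, and a real
launcher would also need a lock so that two simultaneous requests
don't both start a server.

```python
import socket
import subprocess
import time


def is_listening(address, timeout=0.5):
    """Return True if something accepts TCP connections at address."""
    try:
        with socket.create_connection(address, timeout=timeout):
            return True
    except OSError:
        return False


def ensure_server(address, command, wait=5.0):
    """Start command if nothing listens at address; return True if we launched it.

    Raises RuntimeError if the server does not come up within wait seconds.
    """
    if is_listening(address):
        return False
    subprocess.Popen(command)
    deadline = time.monotonic() + wait
    while time.monotonic() < deadline:
        if is_listening(address):
            return True
        time.sleep(0.1)
    raise RuntimeError("server did not start listening at %r" % (address,))
```

The CGI stub would call ensure_server() first and then forward the
request to the long-running process, so the server is only started on
demand and can be killed by the host at any time.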

Some of the hosts only give FTP access, and may not have a compiler.  So 
ideally you could assemble everything on your workstation and upload it 
in batch.  Probably a single Linux executable would be fine -- FreeBSD 
should be able to run it fine, and everything that matters (for this use 
case) is Linux or FreeBSD.  Hopefully Sidnei won't mind that we leave 
Windows out ;) -- commodity Windows hosting is another situation 
entirely (about which I know nothing).

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From eucci.group at gmail.com  Wed Mar  7 23:55:10 2007
From: eucci.group at gmail.com (Jeff Shell)
Date: Wed, 7 Mar 2007 15:55:10 -0700
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
References: <45E8EB97.6090805@zetaweb.com>
	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
	<45E99DC1.4010703@zetaweb.com>
	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
Message-ID: <88d0d31b0703071455g746fc952x4217861043e76399@mail.gmail.com>

On 3/7/07, Jim Fulton <jim at zope.com> wrote:
>
> On Mar 5, 2007, at 4:38 PM, Phillip J. Eby wrote:
> ...
> > Personally, I don't care for the Paste Deploy syntax -- frankly
> > it's almost barbaric.  :)
>
> I don't mean to pick on you, but I really *hate* comments like this.
> I don't like softer forms like "complicated" or even "makes me
> uneasy".  It would be far more helpful if you provided specific
> criticism.  I'd appreciate it if we would all just ignore statements
> like this and, preferably, stop making them.

I agree.

A problem I have is that I see these files with their syntax and I
balk. I don't think it's the syntax that's at issue as much as it is
that there's now a new set of terms that I don't understand. 'Entry
Point' is one that shorted out my brain for a long time whenever
I'd try to look at the Paste docs to figure out what Paste was.

I think I hold Python to a different standard as I want to know what
something is doing. I don't think about this when I configure Apache.
I just know that very few of my Zope 3 terms map to Paste terms, and
all of this talk of 'filters' and 'entry points' and the like... I
look at it and go "huh, interesting." And then it's back to work on my
own thing.

...
> > A couple years back, I started writing a library to parse a more
> > sophisticated, Python-like syntax to do the same sorts of things,
> > but only got as far as the parser.
>
> A few years back, we created a library to parse more sophisticated
> apache-like syntax and I wish we hadn't.  The ini/config format is
> pretty standard and, IMO, really quite adequate.  I'm convinced that
> we don't really need another configuration format, at least not at
> this level.

While we're all talking about what we did or did not make, I found
that I wanted a lot more direct control than zc.buildout gave me.
After growing frustrated with writing Recipes and having to mentally
manage the glue between a config file that was like a make file (it
makes a lot of things) but not like a Rake file (no ability to include
my own programming logic within the buildout spec, only in recipes), I
took inspiration from Rake (a Ruby tool) and wrote a tool that looks
for `Rockfile`, which is basically a Python file (no .py extension so
as to avoid accidental imports).

I still don't *really* understand Eggs, nor how to get them to work
easily within individual Zope 3 instances. None of our existing Zope 3
libraries / apps are written as eggs or even as distutils-installable
packages. We just check our packages directly out of CVS, and
typically just check out other libraries from their repositories as
well. We dump them right in $INSTANCE_HOME/lib/python (a layout that I
actually like) and can then rest assured that a newly deployed app's
need/use of SQLAlchemy 0.3.4 doesn't interfere with an already running
app's need/use of SQLAlchemy 0.2.8.

A further benefit of having the Rockfile system is that they can be
used for other tasks done during development, such as updating
MochiKit, generating a special 'NoExport.js' file, and then packing a
few different combinations of MochiKit together.

    from rocketbuild.api import *
    from string import Template

    ns = namespace('mochikit')
    ROCKFILEPATH = globals().get('ROCKFILEPATH', path('.'))
    MOCHIKIT_LIB = ROCKFILEPATH/'libs'/'mochikit'
    MOCHIKIT_DL = ROCKFILEPATH/'mochikit_dl'
    MOCHIKIT_SRC = MOCHIKIT_DL/'MochiKit'
    SCRATCH = MOCHIKIT_LIB/'_scratch.js'
    CLEANUP = [MOCHIKIT_DL]

    NOEXPORT = Template("""\
    /*
     * Built for MochiKit SVN Checkout ${revision}
     */
    var MochiKit = { __export__: false };
    """)

    @ns.task('get')
    def getmochikit():
        if MOCHIKIT_DL.exists() and bool(MOCHIKIT_DL.listdir()):
            return
        svn = Subversion('http://svn.mochikit.com/mochikit')
        svn.co('trunk', target=MOCHIKIT_DL)

    @ns.task('clearmochilib')
    def clearmochilib():
        for jscript in MOCHIKIT_LIB.files('*.js'):
            jscript.remove()

    @ns.task('make-noexport')
    def makenoexport():
        info = Subversion().info(MOCHIKIT_DL)
        src = NOEXPORT.safe_substitute(**info)
        open(MOCHIKIT_LIB/'NoExport.js', 'w').write(src)

    @ns.task('build', ['get', 'clearmochilib', 'make-noexport'])
    def mochi_install():
        for source in MOCHIKIT_SRC.files('*.js'):
            log.info('copy %s -> %s' % (source, MOCHIKIT_LIB))
            source.copy(MOCHIKIT_LIB)

    @task('clear')
    def clear():
        for p in filter(path.exists, paths(*CLEANUP)):
            log.info('rmtree: %s', p.name)
            p.rmtree()
        if SCRATCH.exists():
            SCRATCH.remove()

I guess I'm just a control freak. It was too hard to control Buildout
to build out something that matches the way we've worked for years; it
was easier to write a tool from scratch. Which I think is the Python
way, for better or worse.

Anyways, this is the tool that we're starting to use at Bottlerocket
to automate our deployments as they grow more complex.
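
To make the dependency lists in the Rockfile above (for example
`@ns.task('build', ['get', 'clearmochilib', 'make-noexport'])`) a bit
more concrete, here is a minimal sketch of how such a task registry
could resolve and run dependencies.  This is NOT the rocketbuild code;
the `Namespace` class and its `run` method are hypothetical stand-ins
written only to illustrate the idea.

```python
class Namespace:
    """Hypothetical task registry: named tasks with dependency lists."""

    def __init__(self, name):
        self.name = name
        self.tasks = {}

    def task(self, name, depends=()):
        """Decorator that registers a function under a task name."""
        def register(func):
            self.tasks[name] = (func, tuple(depends))
            return func
        return register

    def run(self, name, _seen=None):
        """Run a task after its dependencies, each at most once."""
        seen = set() if _seen is None else _seen
        if name in seen:
            return
        seen.add(name)  # marking first also stops dependency cycles
        func, depends = self.tasks[name]
        for dep in depends:
            self.run(dep, seen)
        func()
```

Running `ns.run('build')` would then call the 'get', 'clearmochilib',
and 'make-noexport' tasks in order before the build task itself.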

> ...
>
> > Anyway, all that aside, I think it would be fantastic if we could
> > come up with some "universal file format" for single-file
> > configuration and deployment of applications (including auto-
> > install of all needed eggs),

Configuration and deployment?

I'm trying to understand the scope of these terms (or this combined
term) better. I take it 'configuration' means just how an 'app' might
publish itself to a WSGI server. Is that right?

For us, deployment now is:

1. Make a Zope 3 instance home ('appserv1')
2. `cd appserv1/lib/python; cvs checkout customerapp`
3. `rockout -vv customerapp/Rockfile install` (installs dependencies, mostly
   by CVS / Subversion checkout, usually directly into `appserv1/lib/python`)
4. `cd ../../etc` (back to 'appserv1/etc')
5. choose a port number in zope.conf (the zope/twisted server config)
6. add two lines to Zope 3's `site.zcml` to set up our app:

    <include package="customerapp" file="site.zcml"/>
    <include package="customerapp" file="etc/deployment.zcml"/>

    The first line is a single file that sets up all of the dependencies and
    includes them in the proper order (probably only of interest, maybe, to
    other Zope 3 people). Basically this is my startup for my application
    within the Zope 3 application.

    The second line refers to configuration settings for machine local
    resources (database connections, cached resource directories, and so on).
    This may be written at deployment time. We keep it within the app so that
    it stays under source control, and also lets us know the names of services
    on which we may depend. This is also Zope 3 specific. I don't know of
    any way in which a configuration tool could be generic enough to handle
    any of this - even something as generic as a dburi string - unless it
    was restricted to handling ONLY basic values.

7. add site info to apache (rewriterule(s) /  proxy).

Is this analogous to the deployment and configuration being discussed?
Or is the desired outcome really one where I hand someone a tarball
and/or config file/script which would bring in (or have) ALL of the
Zope 3 framework along with my application and its dependents, ALL in
a way that doesn't trample on anyone/anything else (completely self
contained), and that someone can then add a line or two to the web
server's config file (if even that) and it all just runs? I guess Jim
may be the only one with the Zope 3 knowledge to answer this.

...
> > that could get stdlib support and ultimately hosting company
> > support.  This would actually give us a leg up on even PHP for ease-
> > of-deployment.
>
> Aside from the universal configuration file issue, I think this would
> be a terrific thing for us to focus on.  Something I hear a lot is
> how much easier PHP applications are to deploy to hosting providers.
> I would *love* it if Python had a similar story, even if only for
> smaller applications.
>
> I'd love to get some input from people who know a lot about what
> makes deploying PHP apps so easy.

I believe it's been said already that many PHP apps can just be
un-tarred/gzipped. Plus, PHP has the benefit of being basically built
in to Apache. Most hosting providers can enable PHP for individual
accounts in a snap. So in many cases, deploying a PHP app is seldom
any harder than deploying a static web site.

Granted, there are more advanced applications, and I don't know how
they get packaged or installed.

Perhaps PHP is an unfair case to look at: it's built in, and isn't
terribly complex. It's an easy processor directive. A Pylons,
Turbogears, or Zope 3 'app' isn't a bunch of .psp files that are
executed automatically by Apache. A more fair case to look at is Java
application deployment - maybe. I have no experience (yay!) with this.

I'm still a bit confused by the "write with any framework, deploy on
any server" line I've heard from the Servlet/J2EE world. I think I've
always considered it all to be one and the same. Coming from my long
history with Zope, I've thought "if I program against Zope, I serve
from Zope."

But in theory, since Zope 3 has `zope.app.wsgi`, I could serve from...
anything? I guess that since I don't think about serving via Twisted
any more than I thought about serving via ZServer, I could put
CherryPy, mod_wsgi, whatever else underneath, right?

Sorry if that's a lot of questions. I'm still trying to grasp everything.

-- 
Jeff Shell

From graham.dumpleton at gmail.com  Thu Mar  8 00:04:44 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Thu, 8 Mar 2007 10:04:44 +1100
Subject: [Web-SIG] WSGI server/adapter and sys.exit()/SystemExit exception.
Message-ID: <88e286470703071504u2459bcb6r3969cff7fe2a06d5@mail.gmail.com>

Since discussion is moving towards looking at defining responsibilities
of the container or environment that a WSGI application runs in,
thought it would be a good time to ask this question.

The question is, if a WSGI application calls sys.exit() or raises a
SystemExit exception explicitly, what action if any should a WSGI
server/adapter take in response. Should it allow the process to be
shut down, or should it ignore it?

If the WSGI server/adapter doesn't ignore it, then you run the risk of
a WSGI application shutting down your whole web server if everything
runs within the one process. In the case of a web server where
applications run in multiple spawned child processes, e.g. Apache, then
you only affect the one process that the request was handled within.
Even so, in the case of Apache, if the worker MPM was being used and
thus there could be requests being handled in parallel in the same
process, maybe not even as Python requests, but static file requests,
PHP requests, CGI etc, then these other requests would still be
affected by the process being killed.

Thus to my mind any WSGI server/adapter should possibly always ignore
a SystemExit exception coming from within an executing WSGI application.
One also has to worry, though, about SystemExit exceptions raised as a
side effect of a Python import performed to load a WSGI application.
Then you potentially have the issue of a SystemExit exception raised
from a thread spawned by the WSGI application.

What are other people's thoughts on this? Should one try and protect
the container application from abuse of SystemExit by a WSGI
application or should one simply trust the application writer? In a
shared web hosting environment can someone ever trust an application
writer in this way though?
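
To make the "ignore it" option concrete, here is a minimal sketch of
middleware a server/adapter could wrap around an application.  The class
name is made up, and buffering the whole response with list() is a
simplification so that an exit raised while the body is being generated
is also caught.  It does not address the import-time or spawned-thread
cases mentioned above.

```python
import sys


class IgnoreSystemExit:
    """WSGI middleware that turns SystemExit from the app into a 500."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        try:
            result = self.app(environ, start_response)
            try:
                # Consume the iterable here so lazily raised exits are caught.
                return list(result)
            finally:
                if hasattr(result, "close"):
                    result.close()
        except SystemExit:
            errors = environ.get("wsgi.errors", sys.stderr)
            errors.write("WSGI application attempted to exit the process\n")
            # exc_info lets the server replace headers it has not sent yet.
            start_response("500 Internal Server Error",
                           [("Content-Type", "text/plain")],
                           sys.exc_info())
            return [b"Internal Server Error"]
```

Whether swallowing the exception like this is the right policy is
exactly the open question; the sketch only shows that the mechanism
itself is small.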

Comments?

Graham

From fumanchu at amor.org  Thu Mar  8 00:13:20 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Wed, 7 Mar 2007 15:13:20 -0800
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <88d0d31b0703071455g746fc952x4217861043e76399@mail.gmail.com>
Message-ID: <435DF58A933BA74397B42CDEB8145A8609E8D931@ex9.hostedexchange.local>

Jeff Shell wrote:
> Configuration and deployment?
> 
> I'm trying to understand the scope of these terms (or this combined
> term) better. I take it 'configuration' means just how an 'app' might
> publish itself to a WSGI server. Is that right?
> 
> For us, deployment now is:
> 
> 1. Make a Zope 3 instance home ('appserv1')
> 2. `cd appserv1/lib/python; cvs checkout customerapp`
> 3. `rockout -vv customerapp/Rockfile install` (installs dependencies,
>    mostly by CVS / Subversion checkout, usually directly into
>    `appserv1/lib/python`)
> 4. `cd ../../etc` (back to 'appserv1/etc')
> 5. choose a port number in zope.conf (the zope/twisted server config)
> 6. add two lines to Zope 3's `site.zcml` to set up our app
> 7. add site info to apache (rewriterule(s) /  proxy).
> 
> Is this analogous to the deployment and configuration being discussed?

Yes, although I want to make sure we keep discussion of 'site
installation' very separate from 'website composition' (where you
already have all the pieces and just need to declare where they are and
how they map to URL's). IMO site installation is a 3 to 5-year project;
website composition is a one-year project that shouldn't get bogged down
in the former.

> But in theory, since Zope 3 has `zope.app.wsgi`, I could serve from...
> anything? I guess that since I don't think about serving via Twisted
> any more than I thought about serving via ZServer, I could put
> CherryPy, mod_wsgi, whatever else underneath, right?

In theory, yes. For example, you should be able to put CherryPy's WSGI
server underneath. Most of the rest of CherryPy (the app framework bits)
are not directly *connectable* to the rest of Zope, but one of the
dreams of WSGI is that you could *compose* a site using apps from
multiple frameworks. See the diagram at the bottom of
http://www.cherrypy.org/wiki/WSGI for example, which shows all of the
places you can connect foreign WSGI components with CherryPy WSGI
components.


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From fumanchu at amor.org  Thu Mar  8 00:25:07 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Wed, 7 Mar 2007 15:25:07 -0800
Subject: [Web-SIG] WSGI server/adapter and sys.exit()/SystemExit
	exception.
In-Reply-To: <88e286470703071504u2459bcb6r3969cff7fe2a06d5@mail.gmail.com>
Message-ID: <435DF58A933BA74397B42CDEB8145A8609E8D969@ex9.hostedexchange.local>

Graham Dumpleton wrote:
> The question is, if a WSGI application calls sys.exit() or raises a
> SystemExit exception explicitly, what action if any should a WSGI
> server/adapter take in response. Should it allow the process to be
> shut down, or should it ignore it?
> 
> ...to my mind any WSGI server/adapter should possibly always ignore
> a SystemExit exception coming from within an executing WSGI application.
> One though also has to worry about SystemExit exceptions raised as a
> side effect of a Python import performed to load a WSGI application.
> Then you potentially have the issue of SystemExit exception raised
> from thread spawned by WSGI application.

For now, I'd try to give deployers using my tools direct control over
whether an application is allowed to stop the process or not via
SystemExit.

In a future pywebd/webctl world, I'd like to see process
shutdown/restart delegated to plugins only, which can then be
attached/detached by deployers. For example, the current pywebd
autoreload plugin can call os.execv; if you're deploying with mod_python
that autoreload plugin is simply never attached and therefore cannot
call execv. If deploying by calling a hypothetical 'webctl' script, I
would expect a command-line arg or config entry which controlled whether
or not to plug in the autoreloader. Finally, when deploying from
CherryPy itself, it plugs in (and configures) the autoreloader based on
the existing CherryPy config semantics.


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From ianb at colorstudy.com  Thu Mar  8 00:35:35 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed, 07 Mar 2007 17:35:35 -0600
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <88d0d31b0703071455g746fc952x4217861043e76399@mail.gmail.com>
References: <45E8EB97.6090805@zetaweb.com>	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>	<45E99DC1.4010703@zetaweb.com>	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
	<88d0d31b0703071455g746fc952x4217861043e76399@mail.gmail.com>
Message-ID: <45EF4C47.8060700@colorstudy.com>

Jeff Shell wrote:
> But in theory, since Zope 3 has `zope.app.wsgi`, I could serve from...
> anything? I guess that since I don't think about serving via Twisted
> any more than I thought about serving via ZServer, I could put
> CherryPy, mod_wsgi, whatever else underneath, right?

In theory you can set up Zope 3 using something like:

   [app:main]
   paste.app_factory = some_function_yet_to_be_written

I thought zope.paste did this, but it's a little wonky now that I look 
at it.  It seems to basically read INSTANCE_HOME and create a single 
Zope WSGI app, and then kind of minimally plug into it.  That function 
would more ideally look like:

   import os

   from zope.app.wsgi import getWSGIApplication

   def make_zope_app(global_conf, instance_home=None, configfile=None):
       if configfile is None:
           configfile = global_conf.get('configfile')
       if configfile is None:
           if instance_home is None:
               instance_home = global_conf.get('instance_home')
           if not instance_home:
               raise ValueError(
                   'You must give a configfile or instance_home value')
           configfile = os.path.join(instance_home, 'etc', 'zope.conf')
       app = getWSGIApplication(configfile)
       return app
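
For context on where global_conf and the keyword arguments come from,
here is a simplified sketch of what a loader does with an [app:...]
section: read the options, resolve the "module:callable" factory name,
and call it.  This is an illustration built on the standard library, not
Paste Deploy's actual implementation (which also handles "use =", egg
entry points, [DEFAULT] and more); `load_app` and `resolve_factory` are
names made up for the sketch.

```python
import configparser
import importlib


def resolve_factory(spec):
    """Turn 'package.module:callable' into the callable it names."""
    module_name, _, attr = spec.partition(":")
    return getattr(importlib.import_module(module_name), attr)


def load_app(ini_text, here, section="app:main", resolve=resolve_factory):
    """Build a WSGI app from one [app:...] section of an ini file."""
    # "here" is made available for %(here)s interpolation.
    parser = configparser.ConfigParser(defaults={"here": here})
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(ini_text)
    global_conf = dict(parser.defaults())
    # Options set in the section itself become keyword arguments.
    local_conf = {key: value for key, value in parser.items(section)
                  if key not in global_conf}
    factory = resolve(local_conf.pop("paste.app_factory"))
    return factory(global_conf, **local_conf)
```

With Paste Deploy installed, the real equivalent is
paste.deploy.loadapp('config:/path/to/deploy.ini').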

Then in Zope's setup.py:

   setup(...
     entry_points="""
     [paste.app_factory]
     main = zope.some_module:make_zope_app
     """)

Then you'd configure it like:

   [app:main]
   use = egg:Zope
   # Same directory as the config file:
   instance_home = %(here)s
   # instead of "use", and if you didn't set up the entry point:
   #paste.app_factory = zope.some_module:make_zope_app

And you'd set up a server like:

   [server:main]
   # CherryPy doesn't natively provide this entry point...
   use = egg:PasteScript#cherrypy
   # or...
   #use = egg:Paste#http, egg:Flup#scgi, etc
   host = 0.0.0.0
   port = 8080

Put both those sections in one file (say, deploy.ini) and then do:

   $ paster serve deploy.ini

And it'll start up.  Additionally, instead of plugging that app directly 
into a server, you could wrap it with different kinds of middleware, 
which is where it starts looking a bit more interesting.  For instance, 
for Paste's interactive debugger:

   [app:main]
   use = egg:Zope ...
   filter-with = egg:Paste#evalerror

Though that probably won't quite work, because we don't all agree on a 
way to indicate to the app that it shouldn't catch unexpected errors 
(Zope uses environ['wsgi.handleErrors']); which is incidentally what 
this proposed spec would help us agree on: 
http://wsgi.org/wsgi/Specifications/throw_errors
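
As a sketch of what agreeing on such a key would buy us, here is a small
filter in the paste.filter_factory style (a factory that returns a
function wrapping an app).  The environ key 'x-wsgiorg.throw_errors' is
the name used by the proposal as I understand it, so treat it as an
assumption; the factory name is made up.

```python
def make_throw_errors_filter(global_conf, **local_conf):
    """Filter factory: returns a function that wraps a WSGI app."""
    def apply_filter(app):
        def wrapped(environ, start_response):
            # Ask the inner app to let unexpected exceptions propagate
            # (key name assumed from the throw_errors proposal).
            environ["x-wsgiorg.throw_errors"] = True
            # Zope-specific spelling of the same request, mentioned above.
            environ["wsgi.handleErrors"] = False
            return app(environ, start_response)
        return wrapped
    return apply_filter
```

A debugging middleware placed outside this filter would then see the
original exception instead of the framework's own error page.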

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From eucci.group at gmail.com  Thu Mar  8 00:58:20 2007
From: eucci.group at gmail.com (Jeff Shell)
Date: Wed, 7 Mar 2007 16:58:20 -0700
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <435DF58A933BA74397B42CDEB8145A8609E8D931@ex9.hostedexchange.local>
References: <88d0d31b0703071455g746fc952x4217861043e76399@mail.gmail.com>
	<435DF58A933BA74397B42CDEB8145A8609E8D931@ex9.hostedexchange.local>
Message-ID: <88d0d31b0703071558t179d13cdi3fc3b167f1a789be@mail.gmail.com>

On 3/7/07, Robert Brewer <fumanchu at amor.org> wrote:
> Jeff Shell wrote:
> > Configuration and deployment?
> >
> > I'm trying to understand the scope of these terms (or this combined
> > term) better. I take it 'configuration' means just how an 'app' might
> > publish itself to a WSGI server. Is that right?
> >
> > For us, deployment now is:
> >
> > 1. Make a Zope 3 instance home ('appserv1')
> > 2. `cd appserv1/lib/python; cvs checkout customerapp`
> > 3. `rockout -vv customerapp/Rockfile install` (installs dependencies,
> >    mostly by CVS / Subversion checkout, usually directly into
> >    `appserv1/lib/python`)
> > 4. `cd ../../etc` (back to 'appserv1/etc')
> > 5. choose a port number in zope.conf (the zope/twisted server config)
> > 6. add two lines to Zope 3's `site.zcml` to set up our app
> > 7. add site info to apache (rewriterule(s) /  proxy).
> >
> > Is this analogous to the deployment and configuration being discussed?
>
> Yes, although I want to make sure we keep discussion of 'site
> installation' very separate from 'website composition' (where you
> already have all the pieces and just need to declare where they are and
> how they map to URL's). IMO site installation is a 3 to 5-year project;
> website composition is a one-year project that shouldn't get bogged down
> in the former.

Could you elaborate more on these terms? To whom do the spans 'one
year project' and '3 to 5 year project' apply?

Often we have web apps, written in Zope 3, that are really two or more
web apps. Like an 'admin' side and 'public' side, typically handled
via different skins/views. Apache rewrite rules basically handle that
routing. So in my mind, if I deploy our CMS, I have the following URL
maps:

http://example.com/admin/(.*) => examplesite/++skin++CMSAdmin/$1
http://example.com/(.*) => examplesite/++skin++ExamplePublic/$1

Same Zope application, with just a couple of different settings based
on the incoming URL, and then Zope and our app handles the rest of the
URL.

Is that a site installation? Two site installations? Or two examples
of website composition? Again, I'm just trying to understand the
terminology and map it to the way I'm used to working, and I think of
the above as 'site installation'.

The other tried and true example I can think of is when a customer
asks "uhm, and can we have a forum with that?" I guess website
composition might include the above two URL maps, plus one for:

http://example.com/forum/(.*) => SuperTerrificPylonsWebForumWSGI

But should this be the province of WSGI? With Apache rewrite rules, if
I was doing such a blunt grafting of 'forum' onto my customer's site,
I could just as easily use phpBB. Then I'm not limiting myself to
Python if I feel there's a better suited tool for a particular task.

I brought up this forum example because it's something we've run into
a couple of times and may be about to encounter again. Depending on
customer needs and wants, one of our thoughts is to just drop in some
PHP bulletin board or some other feature complete app.

So if SuperTerrificPylonsWebForumWSGI is basically a black box - I
configure its colors, templates, etc, but expect no other integration
with the customer's main site / CMS - what benefits might I get from
composing via WSGI?

-- 
Jeff Shell

From ianb at colorstudy.com  Thu Mar  8 01:13:22 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed, 07 Mar 2007 18:13:22 -0600
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <88d0d31b0703071558t179d13cdi3fc3b167f1a789be@mail.gmail.com>
References: <88d0d31b0703071455g746fc952x4217861043e76399@mail.gmail.com>	<435DF58A933BA74397B42CDEB8145A8609E8D931@ex9.hostedexchange.local>
	<88d0d31b0703071558t179d13cdi3fc3b167f1a789be@mail.gmail.com>
Message-ID: <45EF5522.8000004@colorstudy.com>

Jeff Shell wrote:
> Often we have web apps, written in Zope 3, that are really two or more
> web apps. Like an 'admin' side and 'public' side, typically handled
> via different skins/views. Apache rewrite rules basically handle that
> routing. So in my mind, if I deploy our CMS, I have the following URL
> maps:
> 
> http://example.com/admin/(.*) => examplesite/++skin++CMSAdmin/$1
> http://example.com/(.*) => examplesite/++skin++ExamplePublic/$1
> 
> Same Zope application, with just a couple of different settings based
> on the incoming URL, and then Zope and our app handles the rest of the
> URL.
> 
> Is that a site installation? Two site installations? Or two examples
> of website composition? Again, I'm just trying to understand the
> terminology and map it to the way I'm used to working, and I think of
> the above as 'site installation'.
> 
> The other tried and true example I can think of is when a customer
> asks "uhm, and can we have a forum with that?" I guess website
> composition might include the above two URL maps, plus one for:
> 
> http://example.com/forum/(.*) => SuperTerrificPylonsWebForumWSGI
> 
> But should this be the province of WSGI? With Apache rewrite rules, if
> I was doing such a blunt grafting of 'forum' onto my customer's site,
> I could just as easily use phpBB. Then I'm not limiting myself to
> Python if I feel there's a better suited tool for a particular task.
> 
> I brought up this forum example because it's something we've run into
> a couple of times and may be about to encounter again. Depending on
> customer needs and wants, one of our thoughts is to just drop in some
> PHP bulletin board or some other feature complete app.
> 
> So if SuperTerrificPylonsWebForumWSGI is basically a black box - I
> configure its colors, templates, etc, but expect no other integration
> with the customer's main site / CMS - what benefits might I get from
> composing via WSGI?

Well, here's how you might do it in Paste Deploy:

[composite:main]
use = egg:Paste#urlmap
/ = cms
/admin = admin_cms
/forum = forum
/forum_phpBB = forum_phpBB

[app:cms]
use = egg:Zope
instance_home = %(here)s/zope
root_object = examplesite
default_view = ExamplePublic

[app:admin_cms]
use = cms
default_view = CMSAdmin

[app:forum]
use = egg:SuperTerrificPylonsWebForumWSGI
database = mysql://localhost/forum_db

[app:forum_phpBB]
use = egg:wphp
base_dir = %(here)s/phpBB
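
What the [composite:main] section asks for is prefix dispatch.  A
minimal sketch of that idea (an illustration only, not Paste's urlmap
code; the class name is made up): pick the longest matching prefix, move
it from PATH_INFO to SCRIPT_NAME, and hand the request to that app.

```python
class PrefixDispatcher:
    """Dispatch WSGI requests to apps by longest matching URL prefix."""

    def __init__(self, mapping):
        # Longest prefixes first so "/admin" wins over "/".
        self.routes = sorted(
            ((prefix.rstrip("/"), app) for prefix, app in mapping.items()),
            key=lambda route: len(route[0]),
            reverse=True)

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in self.routes:
            # Match whole path segments only: "/forum" must not
            # capture "/forum_phpBB".
            if path == prefix or path.startswith(prefix + "/"):
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                environ["PATH_INFO"] = path[len(prefix):]
                return app(environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found"]
```

Each mounted app sees itself as rooted at its own prefix, which is what
lets unrelated apps (Zope, Pylons, PHP behind a gateway) share one site.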


But then let's say you want all these pieces to look similar:

[composite:main]
...
/_theme_files = theme_files
filter-with = deliverance

[app:theme_files]
use = egg:Paste#static
document_root = %(here)s/theme_files

[filter:deliverance]
use = egg:Deliverance
theme_uri = /_theme_files/blank_theme.html
rule_uri = /_theme_files/rules.xml


And then all the content, regardless of its source (could be PHP, piped 
in via HTTP, or static files) gets piped through Deliverance which wraps 
them all in the same outer theme.  An even more common use would be to 
wrap everything in an authentication middleware that sets REMOTE_USER, 
something that can even be used by PHP apps (at least some PHP apps, 
like WordPress, make using this kind of authentication pretty easy).
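
A minimal sketch of that kind of authentication middleware, using HTTP
Basic auth for brevity.  The class name and the check_password callback
are made up for the example; a real deployment would plug in whatever
user store it has and would likely prefer something other than Basic
auth.

```python
import base64


class BasicAuthMiddleware:
    """Set REMOTE_USER for the wrapped app, or answer 401."""

    def __init__(self, app, check_password, realm="site"):
        self.app = app
        self.check_password = check_password  # callable(user, password) -> bool
        self.realm = realm

    def __call__(self, environ, start_response):
        header = environ.get("HTTP_AUTHORIZATION", "")
        scheme, _, encoded = header.partition(" ")
        if scheme.lower() == "basic":
            try:
                decoded = base64.b64decode(encoded).decode("utf-8")
            except ValueError:
                decoded = ""  # malformed base64 or not valid UTF-8
            user, _, password = decoded.partition(":")
            if user and self.check_password(user, password):
                # Everything downstream can now trust REMOTE_USER.
                environ["REMOTE_USER"] = user
                return self.app(environ, start_response)
        start_response("401 Unauthorized", [
            ("WWW-Authenticate", 'Basic realm="%s"' % self.realm),
            ("Content-Type", "text/plain"),
        ])
        return [b"Authentication required"]
```

Because the result is just an environ key, the same wrapper can sit in
front of the CMS, the forum, and a PHP gateway alike.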

You can mostly do all this stuff via passing HTTP around, and I actually 
really like the ability to easily do HTTP requests based on a WSGI 
request, but it's a lot easier to exchange request information in WSGI 
than HTTP by itself.


-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From fumanchu at amor.org  Thu Mar  8 01:21:35 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Wed, 7 Mar 2007 16:21:35 -0800
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <88d0d31b0703071558t179d13cdi3fc3b167f1a789be@mail.gmail.com>
Message-ID: <435DF58A933BA74397B42CDEB8145A8609E8DA77@ex9.hostedexchange.local>

Jeff Shell wrote:
> On 3/7/07, Robert Brewer <fumanchu at amor.org> wrote:
> > Jeff Shell wrote:
> > > Configuration and deployment?
> > >
> > > I'm trying to understand the scope of these terms (or this combined
> > > term) better. I take it 'configuration' means just how an 'app' might
> > > publish itself to a WSGI server. Is that right?
> > >
> > > For us, deployment now is:
> > >
> > > 1. Make a Zope 3 instance home ('appserv1')
> > > 2. `cd appserv1/lib/python; cvs checkout customerapp`
> > > 3. `rockout -vv customerapp/Rockfile install` (installs dependencies,
> > >    mostly by CVS / Subversion checkout, usually directly into
> > >    `appserv1/lib/python`)
> > > 4. `cd ../../etc` (back to 'appserv1/etc')
> > > 5. choose a port number in zope.conf (the zope/twisted server config)
> > > 6. add two lines to Zope 3's `site.zcml` to set up our app
> > > 7. add site info to apache (rewriterule(s) /  proxy).
> > >
> > > Is this analogous to the deployment and configuration being discussed?
> >
> > Yes, although I want to make sure we keep discussion of 'site
> > installation' very separate from 'website composition' (where you
> > already have all the pieces and just need to declare where they are and
> > how they map to URL's). IMO site installation is a 3 to 5-year project;
> > website composition is a one-year project that shouldn't get bogged down
> > in the former.
> 
> Could you elaborate more on these terms? To whom do the spans 'one
> year project' and '3 to 5 year project' apply?

I meant those terms to apply to web-sig and any work we do on this list
to produce specs, libraries, or tools to address such domains in a
common fashion. That is, I think it would take 3 to 5 years for web-sig
to produce a 'site installation' tool (although leveraging setuptools
could be part of this timeframe), but only a year to produce an initial,
reasonable spec or tool for composing and controlling websites built
with WSGI components.

> Often we have web apps, written in Zope 3, that are really two or more
> web apps. Like an 'admin' side and 'public' side, typically handled
> via different skins/views. Apache rewrite rules basically handle that
> routing. So in my mind, if I deploy our CMS, I have the following URL
> maps:
> 
> http://example.com/admin/(.*) => examplesite/++skin++CMSAdmin/$1
> http://example.com/(.*) => examplesite/++skin++ExamplePublic/$1
> 
> Same Zope application, with just a couple of different settings based
> on the incoming URL, and then Zope and our app handles the rest of the
> URL.
> 
> Is that a site installation? Two site installations? Or two examples
> of website composition? Again, I'm just trying to understand the
> terminology and map it to the way I'm used to working, and I think of
> the above as 'site installation'.

In my book, that would be one site, two apps (and in this example, one
framework). And I never use the word "installation" to describe the
site; to me it's always used as an adjective (as in the phrase
"installation process").

> The other tried and true example I can think of is when a customer
> asks "uhm, and can we have a forum with that?" I guess website
> composition might include the above two URL maps, plus one for:
> 
> http://example.com/forum/(.*) => SuperTerrificPylonsWebForumWSGI
> 
> But should this be the province of WSGI?

It's reasonable to ask for that, IMO. Many people are already using WSGI
to do just that sort of mixing. The issue we're discussing is that there
are currently several ways to declare/compose a stack of WSGI
components, and we'd like to see if we can standardize.

> With Apache rewrite rules, if
> I was doing such a blunt grafting of 'forum' onto my customer's site,
> I could just as easily use phpBB. Then I'm not limiting myself to
> Python if I feel there's a better suited tool for a particular task.
> 
> I brought up this forum example because it's something we've run into
> a couple of times and may be about to encounter again. Depending on
> customer needs and wants, one of our thoughts is to just drop in some
> PHP bulletin board or some other feature complete app.

http://www.google.com/search?q=wsgi+php


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From chad at zetaweb.com  Thu Mar  8 04:36:11 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Wed, 07 Mar 2007 22:36:11 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <21787a9f0703070408l70baf159r72a4865e19d75cc7@mail.gmail.com>
References: <45E8EB97.6090805@zetaweb.com>	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>	<45E99DC1.4010703@zetaweb.com>	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
	<21787a9f0703070408l70baf159r72a4865e19d75cc7@mail.gmail.com>
Message-ID: <45EF84AB.3040807@zetaweb.com>

James,

Thanks for weighing in.

 >> I'd love to get some input from people who know a lot about
 >> what makes deploying PHP apps so easy.
 >
 > In a past life I had a fair amount of experience working with
 > and deploying PHP, so I'll throw in my $0.02.
 >
 > It's worth pointing out that a lot of the "PHP is easier"
 > perception is largely just that -- a perception.

I don't have tons of PHP experience, but I did just finish 
working on a pretty sizable job, and the deployment was anything 
but easy. Instead it was a brittle amalgam of XML, Apache conf, 
and nasty PHP abstractions. My impression is that PHP is easy for 
simple cases (unpack WordPress and go), but quickly gets ugly 
when you start dealing with frameworks.

So maybe Python is the opposite? Harder for the simple cases, but 
more elegant in the more complicated scenarios.


chad


From chad at zetaweb.com  Thu Mar  8 04:45:53 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Wed, 07 Mar 2007 22:45:53 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <45EF3372.4020007@colorstudy.com>
References: <45E99DC1.4010703@zetaweb.com>	<45E8EB97.6090805@zetaweb.com>	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>	<45E99DC1.4010703@zetaweb.com>	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
	<45EF3372.4020007@colorstudy.com>
Message-ID: <45EF86F1.4050709@zetaweb.com>

 > Anyway, my feelings are that it's: (a) simple hierarchy through
 > the filesystem (which will make Chad all excited ;)

BLAM!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

From sidnei at enfoldsystems.com  Thu Mar  8 04:46:05 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Thu, 8 Mar 2007 00:46:05 -0300
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <45EF4C47.8060700@colorstudy.com>
References: <45E8EB97.6090805@zetaweb.com>
	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
	<45E99DC1.4010703@zetaweb.com>
	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
	<88d0d31b0703071455g746fc952x4217861043e76399@mail.gmail.com>
	<45EF4C47.8060700@colorstudy.com>
Message-ID: <a7a2b76b0703071946s76eda7a9qefb9cd1be03d745f@mail.gmail.com>

On 3/7/07, Ian Bicking <ianb at colorstudy.com> wrote:
> In theory you can set up Zope 3 using something like:
>
>    [app:main]
>    paste.app_factory = some_function_yet_to_be_written
>
> I thought zope.paste did this, but it's a little wonky now that I look
> at it.

Well, you're probably missing something then, from [1]:

"""
[app:Paste.Main]
paste.app_factory = zope.paste.application:zope_publisher_app_factory
"""

> It seems to basically read INSTANCE_HOME and create a single
> Zope WSGI app, and then kind of minimally plug into it.

It's actually a mixed bag. It looks for INSTANCE_HOME to know where to
find paste.ini. The second thing it does is to help register an
IServerType factory so that you can actually run the WSGI app created
with the included-in-zope3 Twisted WSGI server. I don't recall if it
runs with ZServer too, probably does.

So, to some extent, it wasn't meant to make Zope 3 a WSGI app that can be
run anywhere, it was just meant to make it possible to use paste to
compose 'a' WSGI app that uses zope.app.publication that could be run
by Twisted or ZServer.

Now, I'm not saying that it can't evolve into something that makes
Zope 3 run as a WSGI app anywhere. It just wasn't the original intent.
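
For reference, the hook that a paste.app_factory line points at is just
a callable taking (global_conf, **local_conf) and returning a WSGI app.
A minimal sketch of such a factory follows; it is not zope.paste's
actual factory, and the option names "greeting" and "debug" as well as
the module path in the comment are assumptions made up for the example.

```python
def hello_app_factory(global_conf, **local_conf):
    """Build a tiny WSGI app from Paste Deploy style configuration.

    A hypothetical ini section pointing at this factory might read:

        [app:main]
        paste.app_factory = mypackage.wsgi:hello_app_factory
        greeting = Hi
    """
    # Options from the app's own section arrive as keyword arguments.
    greeting = local_conf.get("greeting", "Hello")
    # Settings shared across components arrive in global_conf, as strings.
    debug = str(global_conf.get("debug", "false")).lower() in (
        "1", "true", "yes", "on")

    def app(environ, start_response):
        body = ("%s (debug=%s)" % (greeting, debug)).encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]

    return app
```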

[1] http://awkly.org/2006/01/25/zopepaste-wsgi-applications-in-zope-3-using-pastedeploy/

-- 
Sidnei da Silva
Enfold Systems                http://enfoldsystems.com
Fax +1 832 201 8856     Office +1 713 942 2377 Ext 214

From chad at zetaweb.com  Thu Mar  8 05:55:37 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Wed, 07 Mar 2007 23:55:37 -0500
Subject: [Web-SIG] windows, pywebd, webctl
Message-ID: <45EF9749.3070002@zetaweb.com>

All,

Windows
=======

Sidnei, et al.: your points are well-taken and your expertise 
appreciated. Thanks!


pywebd
======

Bob: I'm on board with your vision for a common server library 
here. Count me in.


webctl/filesystem layout/config syntax
======================================

This is looking less hopeful as a place to collaborate:

   - An executable needs a config file on the command line, and/or
     a config file in a pre-determined place.

   - *Requiring* a config file on the command line is butt-ugly.

   - Our opinions regarding filesystem layout seem to be, um,
     non-overlapping.


I'd like to venture one more round on this, however, before 
giving up on it:

   - It might be the case that Zope only has a few files in an
     INSTANCE_HOME, but I find myself putting quite a bit in a
     site's userland:

       - I'll install Python packages in there wholesale, so I get
         their scripts in bin/, and lots of modules in lib/python.

       - I have multiple configuration files in etc/ (as
         discussed) along with templates in etc/templates/.

       - I put documentation in doc/.

       - I have site-specific utility or cron scripts in bin/.

       - I have extra data files in var/.

     Keeping it all in svn means that a website is very nearly
     self-contained and isolated, requiring not much besides
     Python to be installed in the base system. This is great for
     many-sites-on-one-server.

   - For one-site-on-many-servers, why does a Unix-y userland for
     development conflict with deployment? That is, why can't a
     development userland simply be installed into /usr/local for
     deployment? Surely logging differences could be handled in
     configuration, no?

   - Besides, my proposal only specified two requirements:

       etc/<foo>.conf
       lib/python

     Is there really a Unix sysadmin that would balk at this? This
     is all that's really needed for a common executable to get
     your site online. Lay out the rest however you want.

   - Jim, you hold particular disdain for lib/python, but it's
     probably the best example of my "standards enable tools to
     evolve" argument: lib/python buys you distutils, setuptools,
     easy_install, workingenv, etc.

   - This same principle makes sense of runzope, scriptzope, and
     debugzope: standardize the file format (= fs layout), and
     such tools fit perfectly in /usr/local/bin.

   - Almost all of the Windows discussion has centered on daemons
     vs. services. Sidnei, et al.: what does a "native" Windows
     filesystem layout look like for a web application? Is using a
     self-contained Unix-inspired layout a faux pas?

   - As mentioned wrt PHP, users like familiar filesystem layouts.
     Reaching agreement here improves our story for newcomers.
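
To make the two requirements concrete, here is a rough sketch of what a
common executable could do with them. The function names are invented
for illustration, and picking the first *.conf alphabetically when
several exist is an arbitrary assumption, not part of the proposal.

```python
import os
import sys


def find_site_conf(site_root, name=None):
    """Return the path of etc/<name>.conf under site_root, or None.

    With no name given, fall back to the first *.conf file in etc/
    (alphabetical order; an arbitrary choice for this sketch).
    """
    etc = os.path.join(site_root, "etc")
    if not os.path.isdir(etc):
        return None
    candidates = sorted(f for f in os.listdir(etc) if f.endswith(".conf"))
    if name is not None:
        wanted = name + ".conf"
        return os.path.join(etc, wanted) if wanted in candidates else None
    return os.path.join(etc, candidates[0]) if candidates else None


def activate_site(site_root):
    """Put the site's lib/python on sys.path and return its conf path."""
    lib = os.path.join(site_root, "lib", "python")
    if os.path.isdir(lib) and lib not in sys.path:
        # Site-local packages take precedence over system-wide ones.
        sys.path.insert(0, lib)
    return find_site_conf(site_root)
```

Everything else in the site root (bin/, doc/, var/) is ignored, which
is the point: only the two agreed paths matter to the shared tool.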


A common executable (= common fs layout) may very well be pushing 
the limits of collaboration too far, but I'll feel better about 
admitting that if we pursue the conversation a bit further.


chad


From sidnei at enfoldsystems.com  Thu Mar  8 06:09:47 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Thu, 8 Mar 2007 02:09:47 -0300
Subject: [Web-SIG] windows, pywebd, webctl
In-Reply-To: <45EF9749.3070002@zetaweb.com>
References: <45EF9749.3070002@zetaweb.com>
Message-ID: <a7a2b76b0703072109j116fb0d5q18a6481cba8f05dd@mail.gmail.com>

On 3/8/07, Chad Whitacre <chad at zetaweb.com> wrote:
>    - Almost all of the Windows discussion has centered on daemons
>      vs. services. Sidnei, et al.: what does a "native" Windows
>      filesystem layout look like for a web application? Is using a
>      self-contained Unix-inspired layout a faux pas?

It depends on what you consider a 'web application':

 - If it's ASP or ASP.NET, it's just a bunch of files dropped in a
directory, just like PHP. It usually has its configuration in a
'web.config' or similar in the same directory.

 - But typically (well, before IIS 7) a 'web application' was
recommended to be implemented as an ISAPI Extension. That's basically
a DLL that you register through the IIS Management Console.

I could envision an ISAPI Extension that you register for some file
extension (or for '*') and that basically delegates to Paste. Oh, hey,
that sounds like ISAPI WSGI [1][2].

[1] http://code.google.com/p/isapi-wsgi/wiki/ISAPISimpleHandlerDocs
[2] http://pylonshq.com/project/pylonshq/wiki/ServePylonsWithIIS
-- 
Sidnei da Silva
Enfold Systems                http://enfoldsystems.com
Fax +1 832 201 8856     Office +1 713 942 2377 Ext 214

From jim at zope.com  Thu Mar  8 12:45:18 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 8 Mar 2007 06:45:18 -0500
Subject: [Web-SIG] daemon tools
In-Reply-To: <a7a2b76b0703070542o617e2c68l9b9000eac6e3064c@mail.gmail.com>
References: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local>
	<475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com>
	<a7a2b76b0703070542o617e2c68l9b9000eac6e3064c@mail.gmail.com>
Message-ID: <9783FC39-D805-4CDB-BB7A-D68BE963567C@zope.com>


On Mar 7, 2007, at 8:42 AM, Sidnei da Silva wrote:

> On Windows, the NT Service Controller does all the dirty work. And it's
> pretty easy to write a service in Python that can run any application.
> The simplest Python service is shorter than 30 lines I think.

Would such a controller:

- Invoke the application as a subprocess, or

- Be part of the application.  (For example, would it be more like
ll.daemon or zdaemon?)

...

> There's some stuff from zdaemon that would be useful though, and do
> not work on Windows today due to some over-unixism in zdaemon, like an
> interactive prompt and script runner as 'zopectl debug' and 'zopectl
> run', I'm sure those two don't need to know about 'fork' or signals.

Note that the scope of zdaemon, as its name implies, was always
limited to Unix.  If it is reasonable to do so, I'd be happy to see a
single tool that handled both cases, although if there is a choice
between a single tool that handled both cases adequately and 2 tools
that handled both cases well, I'd pick the latter.

Also note that the "script runner" feature you mention isn't part of  
zdaemon.  zdaemon  has a subclassing interface, which is currently  
undocumented, that Zope uses to add the "run" and "debug" commands.  
These are Zope specific.

As I mentioned earlier, I'd personally be happy to see the shell mode  
go.


> What I'm really interested in is in how the service would communicate
> with the program being controlled. This is the painful part, and where
> I think we need to work together to make sure it works on Windows and
> on *nix platforms. You can surely count on me to discuss that part.

I think an event model, as Robert Brewer described, is a good start.


> As I mentioned on another thread, Zope uses 'signals' on *nix, and
> 'named events' on Windows, by means of the 'Signals' package in Zope.

I'm not familiar with that. :)  So that unifies Unix signals and  
Windows events? Interesting.

Jim


--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From chris at simplistix.co.uk  Thu Mar  8 10:36:48 2007
From: chris at simplistix.co.uk (Chris Withers)
Date: Thu, 08 Mar 2007 09:36:48 +0000
Subject: [Web-SIG] [Proposal] "website" and first-level conf
In-Reply-To: <321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
	<45EDC772.3090803@simplistix.co.uk>
	<321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com>
Message-ID: <45EFD930.1040406@simplistix.co.uk>

Jim Fulton wrote:
> 
> Having everything in one folder is great for development.  It isn't so 
> good for deployment, at least not on Unix. 

Can you explain why? I do a lot of unix deployment, and the thought of a 
buildout that sprays files all over the system, even if they are in 
standard unix-y locations, scares me a lot...

> (I can think of lots of 
> reasons why it wouldn't be great on Windows either.)  

I'm interested to hear these too since all the microsoft apps I know of 
tend to have a "one folder" model...

> For example, site 
> administrators like to keep log files together and separate from other 
> files.

As a site admin myself, I like to keep log files together, but on a 
per-project basis, I think it's a personal preference thing...

> Even if things are all together, there's really no point in having 
> separate subdirectories, typically containing only one or 2 files, 

Yep, you've persuaded me on that :-)

> single directory containing the few needed files directly.  The only 
> exception to this for me would be to have a subdirectory for Python 
> modules, if you have instance specific Python modules. 

Indeed. Again, I prefer to have all non-standard-library modules and 
packages in the instance home, so different versions don't interfere 
with each other. Yes, this pattern is probably most suited to a
development environment, but being able to svn the whole instance and
just check that out on the production servers is something I personally
find very powerful.

> But without these, you have something like:
> 
>   zope.conf
>   zopectl
>   runzope
>   debugzope
>   scriptzope
>   Data.fs
>   zope.log
> 
> It is pretty clear that zope.conf is a configuration file, zope.log is a 
> log file, and that Data.fs is a data file.  On Unix, it's pretty clear that the others 
> are scripts, because they're executable and, on Windows, they should 
> have .bat or .exe suffixes.

Agreed, I care less about the folders than I thought ;-)
Although if pressed I think I'd still prefer them than not...

> I'm not sure if you are referring to more than scripts.  I agree that we 
> shouldn't have put utility scripts in instances. 

No, it's the utility scripts that I think are a nightmare waiting to 
happen the first time one of them changes as part of a Zope upgrade.

> I would argue that 
> only the ctl script should go in instances.  The runzope, scriptzope, 
> and debugzope scripts could be completely generic and invoked by an 
> instance specific ctl script. 

Exactly :-)

> This is what I do in my latest Zope 3 
> buildout recipes.

Are those recipes available anywhere?

> Only for a particular definition of "works".  No experienced Unix 
> administrator would say it works on Unix. I suspect that a professional 
> Windows server administrator would have similar concerns.

I don't agree with either of these at the moment. What's the reasoning 
for wanting to spray files from one project all over the filesystem?

> My original point was not to advocate a particular layout but to point 
> out that different layouts will be needed in different situations and 
> that mandating a particular layout was likely to cause problems.

Yep, now that's something I strongly agree with :-)

cheers,

Chris

-- 
Simplistix - Content Management, Zope & Python Consulting
            - http://www.simplistix.co.uk


From chris at simplistix.co.uk  Thu Mar  8 10:40:42 2007
From: chris at simplistix.co.uk (Chris Withers)
Date: Thu, 08 Mar 2007 09:40:42 +0000
Subject: [Web-SIG] windows daemon tools
In-Reply-To: <20070307113414.5ee7384a@Fenix>
References: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local>	<475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com>	<a7a2b76b0703070542o617e2c68l9b9000eac6e3064c@mail.gmail.com>
	<20070307113414.5ee7384a@Fenix>
Message-ID: <45EFDA1A.5000606@simplistix.co.uk>

Rodrigo Senra wrote:
> For symmetry's sake in Windows a Python service manager could simply
> use SCManager API under the hood (through win32all) to get the job done,
> still keeping a consistent cross-platform modus operandi.

I would love to see this, particularly for Zope, although I sadly don't 
have the skill to implement :-(

Chris

-- 
Simplistix - Content Management, Zope & Python Consulting
            - http://www.simplistix.co.uk


From chris at simplistix.co.uk  Thu Mar  8 10:55:49 2007
From: chris at simplistix.co.uk (Chris Withers)
Date: Thu, 08 Mar 2007 09:55:49 +0000
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
References: <45E99DC1.4010703@zetaweb.com>
	<45E8EB97.6090805@zetaweb.com>	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>	<45E99DC1.4010703@zetaweb.com>	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
Message-ID: <45EFDDA5.4010205@simplistix.co.uk>

Jim Fulton wrote:
> On Mar 5, 2007, at 4:38 PM, Phillip J. Eby wrote:
> ...
>> Personally, I don't care for the Paste Deploy syntax -- frankly  
>> it's almost barbaric.  :)
> 
> I don't mean to pick on you, but I really *hate* comments like this.   

That's okay ;-)

> criticism.  I'd appreciate it if we would all just ignore statements  
> like this and, preferably, stop making them.

...but I don't think this is. I'd much prefer to hear people's gut 
feelings, even if they can't justify them. It all gives an indication. Yes, 
if only one person says "this sucks", then their opinion may not be 
worth changing the implementation for. However, if 50% of users said 
"this sucks", even if they couldn't explain why, that'd be something 
worth worrying about.

> A few years back, we created a library to parse more sophisticated  
> apache-like syntax and I wish we hadn't. 

I'm glad ZConfig exists.

> The ini/config format is  
> pretty standard and, IMO, really quite adequate. 

How does it handle nesting?

cheers,

Chris

-- 
Simplistix - Content Management, Zope & Python Consulting
            - http://www.simplistix.co.uk


From fdrake at gmail.com  Thu Mar  8 13:30:33 2007
From: fdrake at gmail.com (Fred Drake)
Date: Thu, 8 Mar 2007 07:30:33 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <45EFDDA5.4010205@simplistix.co.uk>
References: <45E8EB97.6090805@zetaweb.com>
	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
	<45E99DC1.4010703@zetaweb.com>
	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
	<45EFDDA5.4010205@simplistix.co.uk>
Message-ID: <9cee7ab80703080430yc62a8bch3f3e08462d7f5792@mail.gmail.com>

On 3/8/07, Chris Withers <chris at simplistix.co.uk> wrote:
> I'm glad ZConfig exists.

Me too, though it does many things differently than if I'd had free rein.

> How does it handle nesting?

It doesn't, but an application can use explicit references to other
sections.  It doesn't take care of things magically without some
additional help, for which we've avoided premature abstraction.
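
A sketch of what such explicit references can look like with the
standard library's configparser. The section names, the keys, and the
rule that "a value naming another section is a reference" are all
assumptions invented for this example, not a zc.buildout or Paste
Deploy convention.

```python
import configparser

EXAMPLE = """\
[app:main]
database = db:primary
log = log:errors

[db:primary]
dsn = postgres://localhost/example

[log:errors]
path = /var/log/example/errors.log
"""


def resolve(parser, section, _seen=()):
    """Return the options of a section as a dict.

    Any value that is exactly the name of another section is replaced
    by that section's (recursively resolved) options.
    """
    if section in _seen:
        raise ValueError("circular section reference: %s" % section)
    resolved = {}
    for key, value in parser.items(section):
        if parser.has_section(value):
            resolved[key] = resolve(parser, value, _seen + (section,))
        else:
            resolved[key] = value
    return resolved


parser = configparser.ConfigParser(interpolation=None)
parser.read_string(EXAMPLE)
config = resolve(parser, "app:main")
```

The file stays flat; the nesting exists only in the application code
that chooses to follow the references.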

The .ini format is working quite well for zc.buildout, I think.  The
support for layering multiple files is quite nice, and is completely
explicit.


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"Every sin is the result of a collaboration." --Lucius Annaeus Seneca

From sidnei at enfoldsystems.com  Thu Mar  8 15:01:03 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Thu, 8 Mar 2007 11:01:03 -0300
Subject: [Web-SIG] daemon tools
In-Reply-To: <9783FC39-D805-4CDB-BB7A-D68BE963567C@zope.com>
References: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local>
	<475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com>
	<a7a2b76b0703070542o617e2c68l9b9000eac6e3064c@mail.gmail.com>
	<9783FC39-D805-4CDB-BB7A-D68BE963567C@zope.com>
Message-ID: <a7a2b76b0703080601w392eb0case1bd6a42c6ed5b71@mail.gmail.com>

On 3/8/07, Jim Fulton <jim at zope.com> wrote:
>
> On Mar 7, 2007, at 8:42 AM, Sidnei da Silva wrote:
>
> > On Windows, the NT Service Controller does all the dirty work. And it's
> > pretty easy to write a service in Python that can run any application.
> > The simplest Python service is shorter than 30 lines I think.
>
> Would such a controller:
>
> - Invoke the application as a subprocess, or
>
> - Be part of the application.  (For example, would it be more like
> ll.daemon or zdaemon?)
>
> ...

Well, it could be both really. But of course the easiest to integrate
with (from the I-dont-want-to-learn-anything-about-windows
perspective) would be the former.
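
A platform-neutral sketch of the "former" option, where the controller
simply runs the application as a subprocess. The class name is invented
for illustration, and the wiring into the NT Service Controller itself
(for example through pywin32) is deliberately left out.

```python
import subprocess
import sys


class SubprocessController:
    """Start and stop an application as a child process."""

    def __init__(self, argv):
        self.argv = list(argv)
        self.process = None

    def start(self):
        # Do nothing if the child is already running.
        if self.is_running():
            return
        self.process = subprocess.Popen(self.argv)

    def is_running(self):
        return self.process is not None and self.process.poll() is None

    def stop(self, timeout=5.0):
        if not self.is_running():
            return
        # Ask politely first; Popen.terminate() works on both
        # Unix (SIGTERM) and Windows (TerminateProcess).
        self.process.terminate()
        try:
            self.process.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            # Fall back to a hard kill if the child ignores the request.
            self.process.kill()
            self.process.wait()
```

For example, SubprocessController([sys.executable, "app.py"]) would run
a hypothetical app.py; the application needs no Windows-specific code.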

> > There's some stuff from zdaemon that would be useful though, and do
> > not work on Windows today due to some over-unixism in zdaemon, like an
> > interactive prompt and script runner as 'zopectl debug' and 'zopectl
> > run', I'm sure those two don't need to know about 'fork' or signals.
>
> Note that the scope of zdaemon, as its name implies, was always
> limited to Unix.  If it is reasonable to do so, I'd be happy to see a
> single tool that handled both cases, although if there is a choice
> between a single tool that handled both cases adequately and 2 tools
> that handled both cases well, I'd pick the latter.
>
> Also note that the "script runner" feature you mention isn't part of
> zdaemon.  zdaemon  has a subclassing interface, which is currently
> undocumented, that Zope uses to add the "run" and "debug" commands.
> These are Zope specific.

Yeah, so I thought.

> > As I mentioned on another thread, Zope uses 'signals' on *nix, and
> > 'named events' on Windows, by means of the 'Signals' package in Zope.
>
> I'm not familiar with that. :)  So that unifies Unix signals and
> Windows events? Interesting.

Well, it doesn't 'unify them' in the sense that you still have to send
Windows named events, but the event name indicates the expected
signal, by using 'Zope-<pid>-<signalid>'. So an event named
'Zope-1214-9' means SIGKILL for the pid 1214 for example.
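
The naming convention itself fits in a few lines. This is only a sketch
of the 'Zope-<pid>-<signalid>' format described above, not the code of
the Signals package; creating or waiting on the actual Windows named
events is not shown, and the function names are made up.

```python
def event_name(pid, signum, prefix="Zope"):
    """Build the event name that stands for sending signum to pid."""
    return "%s-%d-%d" % (prefix, pid, signum)


def parse_event_name(name):
    """Split an event name back into (prefix, pid, signum)."""
    # Split from the right so a prefix containing '-' still works.
    prefix, pid, signum = name.rsplit("-", 2)
    return prefix, int(pid), int(signum)
```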

-- 
Sidnei da Silva
Enfold Systems                http://enfoldsystems.com
Fax +1 832 201 8856     Office +1 713 942 2377 Ext 214

From janssen at parc.com  Thu Mar  8 15:00:52 2007
From: janssen at parc.com (Bill Janssen)
Date: Thu, 8 Mar 2007 06:00:52 PST
Subject: [Web-SIG] daemon tools
In-Reply-To: <20070307113414.5ee7384a@Fenix> 
References: <435DF58A933BA74397B42CDEB8145A8609D78BFF@ex9.hostedexchange.local>
	<475A07C6-A2DB-4BAA-9040-7E7BC246EE77@zope.com>
	<a7a2b76b0703070542o617e2c68l9b9000eac6e3064c@mail.gmail.com>
	<20070307113414.5ee7384a@Fenix>
Message-ID: <07Mar8.060056pst."57996"@synergy1.parc.xerox.com>

> For symmetry's sake in Windows a Python service manager could simply
> use SCManager API under the hood (through win32all) to get the job done,
> still keeping a consistent cross-platform modus operandi.

That's what I do in UpLib.  Works pretty well.

Bill

From rodsenra at gpr.com.br  Thu Mar  8 15:02:44 2007
From: rodsenra at gpr.com.br (Rodrigo Senra)
Date: Thu, 8 Mar 2007 11:02:44 -0300
Subject: [Web-SIG] [Proposal] "website" and first-level conf
In-Reply-To: <45EFD930.1040406@simplistix.co.uk>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
	<45EDC772.3090803@simplistix.co.uk>
	<321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com>
	<45EFD930.1040406@simplistix.co.uk>
Message-ID: <20070308110244.56b81bd5@Fenix>


[ Chris Withers ]:
|I do a lot of unix deployment, and the thought of
|a buildout that sprays files all over the system, even if they are in 
|standard unix-y locations, scares me a lot...

 I am very sympathetic to the idea of keeping related things together.
 But I have some use cases (counter-examples) to contribute:

 - multiple Zope instances sharing libraries, python modules,
   and Zope/Plone Products. These files might be placed out of
   the instance tree.

 - when the Unix Adm is **not SomeFramework-wise** there is (might be)
   a demand to keep backup-electable-stuff somewhere he/she/it 
   wants (like /etc instead of /someApp/etc). Even if we keep
   the files inside app's tree, deploy scripts might have to create
   hard links outside that tree. 
   
 - logs and data (like Data.fs).... see below

|> For example, site administrators like to keep log files
|> together and separate from other files.
|
|As a site admin myself, I like to keep log files together, but on a 
|per-project basis, I think it's a personal preference thing...
|I don't agree with either of these at the moment. What's the reasoning 
|for wanting to spray files from one project all over the filesystem?

 - one optimization (we actually do) is to create different disk
   partitions. One optimized for *large* files (like logs and 
   databases) and other for small files (like source code, libraries
   and config files).

In spite of that, I would love to keep deploys *totally* self-contained.
Nevertheless, I was not wise enough to work around some of the use cases
presented above ;o)

best regards,
Senra

-------------
Rodrigo Senra
GPr Sistemas 
http://www.gpr.com.br

From benji at benjiyork.com  Thu Mar  8 15:13:54 2007
From: benji at benjiyork.com (Benji York)
Date: Thu, 08 Mar 2007 09:13:54 -0500
Subject: [Web-SIG] [Proposal] "website" and first-level conf
In-Reply-To: <45EFD930.1040406@simplistix.co.uk>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>	<45EDC772.3090803@simplistix.co.uk>	<321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com>
	<45EFD930.1040406@simplistix.co.uk>
Message-ID: <45F01A22.5090301@benjiyork.com>

Chris Withers wrote:
> Jim Fulton wrote:
>> Having everything in one folder is great for development.  It isn't so 
>> good for deployment, at least not on Unix. 
> 
> Can you explain why? I do a lot of unix deployment, and the thought of a 
> buildout that sprays files all over the system, even if they are in 
> standard unix-y locations, scares me a lot...

I think it depends on the people involved.  As a developer I prefer 
everything in one place.

Our system administrators have to manage lots of machines (hundreds) 
with lots of software on them (some we write, some third party).  Their 
perspective is that if they want to find a log file it's better for it 
to be where all the log files are instead of trying to find the corner 
of the file system where that particular app is installed.  This appears 
to be the preference of most unix admins (as evidenced by the various 
linux/unix standardization processes).
-- 
Benji York
http://benjiyork.com

From chris at simplistix.co.uk  Fri Mar  9 11:02:21 2007
From: chris at simplistix.co.uk (Chris Withers)
Date: Fri, 09 Mar 2007 10:02:21 +0000
Subject: [Web-SIG] ConfigParser for configuration
In-Reply-To: <9cee7ab80703080430yc62a8bch3f3e08462d7f5792@mail.gmail.com>
References: <45E8EB97.6090805@zetaweb.com>	
	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>	
	<45E99DC1.4010703@zetaweb.com>	
	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>	
	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>	
	<45EFDDA5.4010205@simplistix.co.uk>
	<9cee7ab80703080430yc62a8bch3f3e08462d7f5792@mail.gmail.com>
Message-ID: <45F130AD.1000904@simplistix.co.uk>

Fred Drake wrote:
> On 3/8/07, Chris Withers <chris at simplistix.co.uk> wrote:
>> I'm glad ZConfig exists.
> 
> Me too, though it does many things differently than if I'd had free rein.

You have free rein now, right? ;-)

>> How does it handle nesting?
> 
> It doesn't, but an application can use explicit references to other
> sections. 

You mean like the format expected by logging.config.fileConfig?

> It doesn't take care of things magically without some
> additional help, for which we've avoided premature abstraction.

Not sure what this means...

Okay, so, say I have a config.ini and I want to have logging sections 
for using in logging.config.fileConfig and other sections for use by my 
app's config.

How would I share the one config file between fileConfig and whatever my 
app uses to tickle ConfigParser? Would each section have to parse the 
file? Would they get confused about keys not designed for them?
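
To pin down the scenario, here is a sketch of one file serving both
consumers, using today's standard library (configparser and
logging.config). The [app] section, its "title" key, and the function
name are assumptions made up for the example. As far as I can tell,
fileConfig only looks at the sections it knows about ([loggers],
[handlers], [formatters] and the logger_*/handler_*/formatter_*
sections they name), so both sides parse the file once each and
ignore the rest.

```python
import configparser
import logging
import logging.config

SHARED_INI = """\
[app]
title = Example site

[loggers]
keys = root

[handlers]
keys = console

[formatters]
keys = plain

[logger_root]
level = INFO
handlers = console

[handler_console]
class = StreamHandler
level = INFO
formatter = plain
args = (sys.stderr,)

[formatter_plain]
format = %(levelname)s %(name)s %(message)s
"""


def load_shared_config(path):
    """Configure logging from path and return the [app] settings."""
    # First pass: the logging package reads only its own sections.
    logging.config.fileConfig(path, disable_existing_loggers=False)
    # Second pass: the application reads its own section. Interpolation
    # is off so the '%(...)s' in the logging format is never expanded.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)
    return dict(parser["app"])
```

Writing SHARED_INI to a file and calling load_shared_config on it sets
up the root logger and hands back the application's own settings.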

Can one config.ini include other .ini files in the same way ZConfig allows?

> The .ini format is working quite well for zc.buildout, I think.  The
> support for layering multiple files is quite nice, and is completely
> explicit.

What is this support for layering multiple files? I couldn't find it 
anywhere in the ConfigParser docs :-S
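
For comparison, plain ConfigParser can layer sources in a limited way:
whatever is read later overrides what was read earlier. The sketch
below shows only that standard-library behaviour (using the modern
configparser module); zc.buildout's own mechanism, its "extends"
option, is a separate feature built on top and is not reproduced here.
The section and key names are invented.

```python
import configparser

BASE = """\
[server]
host = 127.0.0.1
port = 8080
"""

LOCAL = """\
[server]
port = 9090
"""


def layered(*sources):
    """Read config texts in order; later sources override earlier ones."""
    parser = configparser.ConfigParser(interpolation=None)
    for text in sources:
        parser.read_string(text)
    return parser


settings = layered(BASE, LOCAL)
```

With files on disk, parser.read([base_path, local_path]) applies the
same override order.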

cheers,

Chris

-- 
Simplistix - Content Management, Zope & Python Consulting
            - http://www.simplistix.co.uk


From chris at simplistix.co.uk  Fri Mar  9 11:05:19 2007
From: chris at simplistix.co.uk (Chris Withers)
Date: Fri, 09 Mar 2007 10:05:19 +0000
Subject: [Web-SIG] [Proposal] "website" and first-level conf
In-Reply-To: <20070308110244.56b81bd5@Fenix>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>	<45EDC772.3090803@simplistix.co.uk>	<321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com>	<45EFD930.1040406@simplistix.co.uk>
	<20070308110244.56b81bd5@Fenix>
Message-ID: <45F1315F.4000000@simplistix.co.uk>

Rodrigo Senra wrote:
> [ Chris Withers ]:
>  - multiple Zope instances sharing libraries, python modules,
>    and Zope/Plone Products. These files might be placed out of
>    the instance tree.

Sometimes you want this, sometimes you don't ;-)

You want it if you have lots of homogeneous projects that all use the 
same products and libraries.

For me, it's much more common to need to isolate projects because they 
rely on specific versions of products and libraries and often break if 
they have access to the wrong one...

>  - when the Unix Adm is **not SomeFramework-wise** there is (might be)
>    a demand to keep backup-electable-stuff somewhere he/she/it 
>    wants (like /etc instead of /someApp/etc). Even if we keep
>    the files inside app's tree, deploy scripts might have to create
>    hard links outside that tree. 

OK, this is a good argument for making the location selectable ;-)

>  - one optimization (we actually do) is to create different disk
>    partitions. One optimized for *large* files (like logs and 
>    databases) and another for small files (like source code, libraries
>    and config files).

I've never seen the need myself; what measurable differences has this made?

> In spite of that, I would love to keep deploys *totally* self-contained.
> Nevertheless, I was not wise enough to work around some of the use cases
> presented above ;o)

Sounds like we really need to support both...

Chris

-- 
Simplistix - Content Management, Zope & Python Consulting
            - http://www.simplistix.co.uk


From chris at simplistix.co.uk  Fri Mar  9 11:06:30 2007
From: chris at simplistix.co.uk (Chris Withers)
Date: Fri, 09 Mar 2007 10:06:30 +0000
Subject: [Web-SIG] [Proposal] "website" and first-level conf
In-Reply-To: <45F01A22.5090301@benjiyork.com>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>	<45EDC772.3090803@simplistix.co.uk>	<321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com>
	<45EFD930.1040406@simplistix.co.uk>
	<45F01A22.5090301@benjiyork.com>
Message-ID: <45F131A6.1090903@simplistix.co.uk>

Benji York wrote:
> with lots of software on them (some we write, some third party).  Their 
> perspective is that if they want to find a log file it's better for it 
> to be where all the log files are instead of trying to find the corner 
> of the file system where that particular app is installed. 

Yeah, on the log front I have to agree... I've found myself more often 
just heading to /var/log/<x> instead of wanting to hunt elsewhere...

cheers,

Chris

-- 
Simplistix - Content Management, Zope & Python Consulting
            - http://www.simplistix.co.uk


From jim at zope.com  Fri Mar  9 13:58:06 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 9 Mar 2007 07:58:06 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <45EF84AB.3040807@zetaweb.com>
References: <45E8EB97.6090805@zetaweb.com>	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>	<45E99DC1.4010703@zetaweb.com>	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
	<21787a9f0703070408l70baf159r72a4865e19d75cc7@mail.gmail.com>
	<45EF84AB.3040807@zetaweb.com>
Message-ID: <1E39B557-F94D-4BCF-9D05-CE5DBCC76C8B@zope.com>


On Mar 7, 2007, at 10:36 PM, Chad Whitacre wrote:

> James,
>
> Thanks for weighing in.
>
>>> I'd love to get some input who know a lot about what makes
>>> deploying PHP apps so easy.
>>
>> In a past life I had a fair amount of experience working with
>> and deploying PHP, so I'll throw in my $0.02.
>>
>> It's worth pointing out that a lot of the "PHP is easier"
>> perception is largely just that -- a perception.
>
> I don't have tons of PHP experience, but I did just finish
> working on a pretty sizable job, and the deployment was anything
> but easy. Instead it was a brittle amalgam of XML, Apache conf,
> and nasty PHP abstractions. My impression is that PHP is easy for
> simple cases (unpack WordPress and go), but quickly gets ugly
> when you start dealing with frameworks.
>
> So maybe Python is the opposite? Harder for the simple cases, but
> more elegant in the more complicated scenarios.

I don't think this is the case, or, I don't think it has to be.  It  
would be interesting if PHP was simple to deploy for simple  
applications and complex to deploy for complex applications.  That  
would inform our discussion quite a bit, IMO, as I think it would be  
far easier for us to make Python easier to install for simple  
applications than it would be for us to make Python easier to install  
for complex applications.  We could bring tools to bear that would be  
appropriate to the problem.  Maybe this would be a good place to  
start. Dang, I wish I had time to.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jim at zope.com  Fri Mar  9 14:52:38 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 9 Mar 2007 08:52:38 -0500
Subject: [Web-SIG] windows, pywebd, webctl
In-Reply-To: <45EF9749.3070002@zetaweb.com>
References: <45EF9749.3070002@zetaweb.com>
Message-ID: <C1A9F3A6-5574-4AAE-9AE1-31576B9F9BE1@zope.com>


On Mar 7, 2007, at 11:55 PM, Chad Whitacre wrote:

> All,
>
> Windows
> =======
>
> Sidnei, et al.: your points are well-taken and your expertise
> appreciated. Thanks!
>
>
> pywebd
> ======
>
> Bob: I'm on board with your vision for a common server library
> here. Count me in.
>
>
> webctl/filesystem layout/config syntax
> ======================================
>
> This is looking less hopeful as a place to collaborate:
>
>    - An executable needs a config file on the command line, and/or
>      a config file in a pre-determined place.
>
>    - *Requiring* a config file on the command line is butt-ugly.
>
>    - Our opinions regarding filesystem layout seem to be, um,
>      non-overlapping.

You are missing another alternative.  First, keep in mind that with  
setuptools, "executables" are just wrapper scripts that:

- Set up sys.path

- Import an entry point, and

- Call the entry point

These wrapper scripts are *automatically generated*!

It is just as easy to generate wrapper scripts that pass the name of  
a configuration file to the entry point along with other arguments.   
This is in fact what we're doing.  This means that the script joins  
the software configuration (represented by the entry point and eggs  
used) and the process configuration, represented by the configuration  
file. I'm very happy with how this is working for us.
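
A minimal sketch of what such a generated wrapper could look like.  This
is not the actual zc.zope3recipes output: render_wrapper() and the
myapp names and paths are hypothetical, chosen only to illustrate the
three steps above plus passing the configuration file name along.

```python
import textwrap


def render_wrapper(python, paths, module, func, config_path):
    """Return the text of a wrapper script (all names are hypothetical).

    The generated script sets up sys.path, imports the entry point, and
    calls it with the config file name ahead of any command-line
    arguments.
    """
    template = textwrap.dedent("""\
        #!{python}
        import sys

        # 1. Set up sys.path with the locations chosen at generation time.
        sys.path[0:0] = {paths!r}

        # 2. Import the entry point.
        from {module} import {func}

        # 3. Call it with the config file name plus the user's arguments.
        if __name__ == "__main__":
            sys.exit({func}([{config!r}] + sys.argv[1:]))
        """)
    return template.format(python=python, paths=list(paths), module=module,
                           func=func, config=config_path)


script = render_wrapper(
    python="/usr/bin/python",
    paths=["/opt/myapp/eggs/myapp.egg"],
    module="myapp.server",
    func="main",
    config_path="/etc/myapp/instance1.conf",
)
print(script)
```

Because the paths and the config file name are baked in when the script
is generated, nothing has to be discovered from the filesystem layout
at run time.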

(If you're interested in the gory details, see:

    http://www.python.org/pypi/zc.zope3recipes

In particular, to see an example of the sort of generated script I'm  
talking about, go to:

   http://www.python.org/pypi/zc.zope3recipes#log-files

and scroll up.)


> I'd like to venture one more round on this, however, before
> giving up on it:
>
>    - It might be the case that Zope only has a few files in an
>      INSTANCE_HOME, but I find myself putting quite a bit in a
>      site's userland:
>
>        - I'll install Python packages in there wholesale, so I get
>          their scripts in bin/, and lots of modules in lib/python.

Many Zope users put Python packages in their instance homes.  I  
mentioned this in another note as a justification for a  
subdirectory.  Personally, since all of the deployments I do are  
large and require multiple instances of the *same* application, I  
prefer to create a separate application installation and then create  
multiple instances of that.  Most Zope users don't seem to need this,  
however, and combine process instances and application instances into  
the single concept of a Zope instance.


>
>        - I have multiple configuration files in etc/ (as
>          discussed) along with templates in etc/templates/.

We generally take the view that templates are part of the software  
and are managed in Python packages.

...

>      Keeping it all in svn means that a website is very nearly
>      self-contained and isolated, requiring not much besides
>      Python to be installed in the base system. This is great for
>      many-sites-on-one-server.

Absolutely.  We, of course, check everything into svn.  We (ZC) use  
buildouts to assemble the parts we need, which are typically shared  
across many projects.


>
>    - For one-site-on-many-servers, why does a Unix-y userland for
>      development conflict with deployment? That is, why can't a
>      development userland simply be installed into /usr/local for
>      deployment? Surely logging differences could be handled in
>      configuration, no?
>

Because site administrators who actually run the servers and who get  
woken up in the middle of the night when something goes wrong want  
application files to be in standard places, like /etc, /var/log, and  
so on.  These people are not developers.  They are not well served by  
"self-contained" applications, which are, for them at least, only  
part of a much bigger system configuration.

Also note that on multi-core multi-process servers, we have many  
instances of the same application on the same server, so what  
normally gets put in a traditional zope instance is split between an  
application definition and multiple process definitions.  In  
deployment, we install the application definition as an RPM.  We then  
use tools provided by the application definition to create instance  
configurations based on the particular machine's configuration. (For  
ZC, the machine configuration happens to come from a centralized  
database.)
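
To make the split concrete, here is a rough sketch of one application
definition producing several process configurations.  It is not ZC's
tooling: instance_configs(), the "machine" dictionary, and the file
locations are all assumptions made up for illustration.

```python
def instance_configs(app_name, machine):
    """Build one process configuration per instance on a machine.

    `machine` is a hypothetical description, such as might come from a
    central database: a base port and the number of processes to run.
    Returns a dict mapping config file path to config text.
    """
    configs = {}
    for n in range(machine["processes"]):
        name = "%s-%d" % (app_name, n)
        path = "/etc/%s/%s.conf" % (app_name, name)
        configs[path] = (
            "[process]\n"
            "name = %s\n"
            "port = %d\n"
            "log = /var/log/%s/%s.log\n"
        ) % (name, machine["base_port"] + n, app_name, name)
    return configs


configs = instance_configs("myapp", {"processes": 2, "base_port": 8080})
for path in sorted(configs):
    print(path)
```

The application is installed once; only the small per-process files
differ from machine to machine.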

>    - Besides, my proposal only specified two requirements:
>
>        etc/<foo>.conf
>        lib/python
>
>      Is there really a Unix sysadmin that would balk at this?

Yes.


>     This
>      is all that's really needed for a common executable to get
>      your site online. Lay out the rest however you want.

But you don't actually need this at all.

>
>    - Jim, you hold particular disdain for lib/python, but it's
>      probably the best example of my "standards enable tools to
>      evolve" argument: lib/python buys you distutils, setuptools,
>      easy_install, workingenv, etc.

No, actually it doesn't.  It is based on an out-of-date convention.  
workingenv doesn't use it.  It uses lib/pythonx.x. Distutils doesn't  
really use it unless you consider the --home option (or whatever  
it's called).  Distutils is happy to install almost anywhere using  
--install-lib. easy_install wants to install into your system Python.   
lib/python is no easier to supply as an alternate install location  
than any other.  lib/python violates "flat is better than nested" by  
introducing a pointless lib directory.

>    - This same principle makes sense of runzope, scriptzope, and
>      debugzope: standardize the file format (= fs layout), and
>      such tools fit perfectly in /usr/local/bin.

Except that this isn't appropriate for deployment.  When you need to  
do something different, systems that assume things about file-system  
layout produce Gordian knots. I speak from experience from work on  
bending Zope installations to the will of the people with the beepers.

>    - Almost all of the Windows discussion has centered on daemons
>      vs. services. Sidnei, et al.: what does a "native" Windows
>      filesystem layout look like for a web application? Is using a
>      self-contained Unix-inspired layout faux pas?
>
>    - As mentioned wrt PHP, users like familiar filesystem layouts.
>      Reaching agreement here improves our story for newcomers.

I don't have a problem with people using whatever layout they want.   
I don't even object to having common layouts that are documented and  
taught.  What I can't accept is a software framework that *requires*  
a particular layout to function.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jim at zope.com  Fri Mar  9 15:07:47 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 9 Mar 2007 09:07:47 -0500
Subject: [Web-SIG] [Proposal] "website" and first-level conf
In-Reply-To: <45EFD930.1040406@simplistix.co.uk>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
	<45EDC772.3090803@simplistix.co.uk>
	<321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com>
	<45EFD930.1040406@simplistix.co.uk>
Message-ID: <444CE28D-7E2E-478F-9B90-9DD76109FF24@zope.com>


On Mar 8, 2007, at 4:36 AM, Chris Withers wrote:

> Jim Fulton wrote:
>> Having everything in one folder is great for development.  It  
>> isn't so good for deployment, at least not on Unix.
>
> Can you explain why?

Yes. See my response to Chad.

> I do a lot of unix deployment, and the thought of a buildout that  
> sprays files all over the system, even if they are in standard unix- 
> y location scares me a lot...

That's because you are a developer.  I've worked for the last couple  
of years with our system administrators supporting major applications  
at Zope Corporation.  For a long time, we did things *our* (the  
developers) way and they lived with it because they had no choice.   
As time wore on and I got to experience more of their pain, I  
realized that maybe they had a clue after all and that if I worked  
with them rather than complacently assuming that they didn't know the  
best way to deploy applications, my life would be easier.

(Side note: Over time, our management has wised up and our SAs have a  
lot more power to tell us, the developers, what to do.  Fortunately,  
over the same period, we have come to appreciate their position and  
so this isn't a problem. :)

>> (I can think of lots of reasons why it wouldn't be great on Windows  
>> either.)
>
> I'm interested to hear these too since all the microsoft apps I  
> know of tend to have a "one folder" model...

Yeah, that's why I don't use Windows. :)  For years, people's Word  
files ended up in the same directory as the Word application.  If  
I was a Windows server administrator, I would want the software to be  
separate from other artifacts. I'd want to be able to update or  
reinstall the software without losing configuration. I'd want  
configuration data to be managed separately.  This, of course, is  
what the Windows registry does.  It puts all of the configuration in  
one place that is separate from the software install.  I'd expect  
logs to be managed separately.

>> single directory containing the few needed files directly.  The  
>> only exception to this for me would be to have a subdirectory for  
>> Python modules, if you have instance specific Python modules.
>
> Indeed. Again, I prefer to have all non-standard-library modules  
> and packages in the instance home, so different versions don't  
> interfere with each other. Yes, this pattern is probably most  
> suited to development environment, but being able to svn the whole  
> instance and just check that out on the production servers is  
> something I personally find very powerful.

>> This is what I do in my latest Zope 3 buildout recipes.
>
> Are those recipes available anywhere?

   http://www.python.org/pypi/zc.zope3recipes

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jim at zope.com  Fri Mar  9 15:14:40 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 9 Mar 2007 09:14:40 -0500
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <45EFDDA5.4010205@simplistix.co.uk>
References: <45E99DC1.4010703@zetaweb.com>
	<45E8EB97.6090805@zetaweb.com>	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>	<45E99DC1.4010703@zetaweb.com>	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
	<45EFDDA5.4010205@simplistix.co.uk>
Message-ID: <5BA15D06-6EF0-4315-9721-9B793D1E03B8@zope.com>


On Mar 8, 2007, at 4:55 AM, Chris Withers wrote:

> Jim Fulton wrote:
>> On Mar 5, 2007, at 4:38 PM, Phillip J. Eby wrote:
>> ...
>>> Personally, I don't care for the Paste Deploy syntax -- frankly   
>>> it's almost barbaric.  :)
>> I don't mean to pick on you, but I really *hate* comments like this.
>
> That's okay ;-)
>
>> criticism.  I'd appreciate it if we would all just ignore  
>> statements  like this and, preferably, stop making them.
>
> ...but I don't think this is. I'd much prefer to hear people's gut  
> feelings, even if they can't justify them.

That's OK over a drink.  In an open discussion it is very, very  
counterproductive in my experience.

> It all gives indication. Yes, if only one person says "this sucks",  
> then their opinion may not be worth changing the implementation  
> for. However, if 50% of users said "this sucks", even if they  
> couldn't explain why, that'd be something worth worrying about.

Sure, but how do you fix anything if they don't say why it sucks?   
How do you make it better?  How do you even know if they are trying  
to solve the same problem that you are? Or if they've actually tried  
the tool you're talking about?

>> The ini/config format is  pretty standard and, IMO, really quite  
>> adequate.
>
> How does it handle nesting?

Using cross-section references.  So, rather than having an embedded  
section, you have an option that refers to another section (or  
collection of sections).
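
A small sketch of the idea, using the standard library parser (spelled
configparser in Python 3; ConfigParser in Python 2).  The section and
option names are invented for the example, and referenced_sections()
is a hypothetical helper, not part of any library.

```python
import configparser

CONFIG = """
[app]
handlers = console errors

[console]
level = INFO

[errors]
level = ERROR
"""


def referenced_sections(parser, section, option):
    """Resolve a whitespace-separated list of section names into dicts."""
    names = parser.get(section, option).split()
    return [dict(parser.items(name)) for name in names]


parser = configparser.ConfigParser()
parser.read_string(CONFIG)
handlers = referenced_sections(parser, "app", "handlers")
print(handlers)
```

The file itself stays flat; the "handlers" option is what ties the
[console] and [errors] sections to [app].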

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From sidnei at enfoldsystems.com  Fri Mar  9 15:26:51 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Fri, 9 Mar 2007 11:26:51 -0300
Subject: [Web-SIG] [Proposal] "website" and first-level conf
In-Reply-To: <444CE28D-7E2E-478F-9B90-9DD76109FF24@zope.com>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
	<45EDC772.3090803@simplistix.co.uk>
	<321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com>
	<45EFD930.1040406@simplistix.co.uk>
	<444CE28D-7E2E-478F-9B90-9DD76109FF24@zope.com>
Message-ID: <a7a2b76b0703090626gfb25c8axcf1a0ebe09f150b2@mail.gmail.com>

On 3/9/07, Jim Fulton <jim at zope.com> wrote:
> On Mar 8, 2007, at 4:36 AM, Chris Withers wrote:
> > I'm interested to hear these too since all the microsoft apps I
> > know of tend to have a "one folder" model...
>
> Yeah, that's why I don't use Windows. :)

That's not a good enough excuse. :)

> For years, people word
> files ended up in the same directory with the word applications.

I think that predates my involvement with computers, or you're
misremembering something.

> If I was a windows server administrator, I would want the software to be
> separate from other artifacts.

Log files are usually separate from software. For example on XP, IIS
5.1 logs to C:\WINDOWS\system32\LogFiles. As you've already mentioned
most configuration ends up on the registry. I don't see any mixing of
software and artifacts going on.

> I'd want to be able to update or
> reinstall the software without losing configuration.

Well-behaved software will never touch your configuration. I've
developed several installers using Inno Setup and you always have the
choice to say what files should be deleted or not on an uninstall. The
Zope Installer for  Windows never deletes the INSTANCE_HOME.

> I'd want
> configuration data to be managed separately.  This, of course, is
> what the windows registry does.  It puts all of the configuration in
> one place that is separate from the software install.  I'd expect
> logs to be managed separately.

That's totally fine. It could go to C:\WINDOWS\system32\LogFiles too,
or it could just log to the NT Event Log, and then you can configure
all sorts of things related to how long those logs are kept.
There are also great tools that allow you to query those logs as
if they were SQL databases.

-- 
Sidnei da Silva
Enfold Systems                http://enfoldsystems.com
Fax +1 832 201 8856     Office +1 713 942 2377 Ext 214

From fdrake at gmail.com  Fri Mar  9 15:51:52 2007
From: fdrake at gmail.com (Fred Drake)
Date: Fri, 9 Mar 2007 09:51:52 -0500
Subject: [Web-SIG] ConfigParser for configuration
In-Reply-To: <45F130AD.1000904@simplistix.co.uk>
References: <45E8EB97.6090805@zetaweb.com>
	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
	<45E99DC1.4010703@zetaweb.com>
	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
	<45EFDDA5.4010205@simplistix.co.uk>
	<9cee7ab80703080430yc62a8bch3f3e08462d7f5792@mail.gmail.com>
	<45F130AD.1000904@simplistix.co.uk>
Message-ID: <9cee7ab80703090651o2d53c518pff4d52589bd97990@mail.gmail.com>

On 3/9/07, Chris Withers <chris at simplistix.co.uk> wrote:
> You have free rein now, right? ;-)

Heh.  Compatibility is worth something, even to me.

> You mean like the format expected by logging.config.fileConfig?

I haven't looked at that in a long time, but I think that's right.
Essentially, each user of configuration data has to know which
portions of their own configuration contain references to other
sections, and then chase those down (or pass them along) to use that
information.  This would take the form of "formatter =
verbose_formatter", and the [verbose_formatter] section would have all
the configuration data for the formatter.

> > It doesn't take care of things magically without some
> > additional help, for which we've avoided premature abstraction.
>
> Not sure what this means...

The application itself has to understand that it's creating an
arbitrarily nested structure from a simple (two-level) hierarchy.  How
that happens is part of the application, not a magical helper library.
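
Roughly, and only as a sketch: the application declares which options
are references and expands them itself.  REFERENCE_OPTIONS and expand()
are hypothetical names, the module is spelled configparser as in
Python 3, and there is no protection against sections that refer to
each other in a cycle.

```python
import configparser

CONFIG = """
[handler]
class = StreamHandler
formatter = verbose_formatter

[verbose_formatter]
format = {levelname} {name}: {message}
"""

# The application, not the parser, knows which options name other sections.
REFERENCE_OPTIONS = {"formatter"}


def expand(parser, section):
    """Return a section as a dict, replacing references with nested dicts."""
    result = {}
    for key, value in parser.items(section):
        if key in REFERENCE_OPTIONS:
            result[key] = expand(parser, value)
        else:
            result[key] = value
    return result


parser = configparser.ConfigParser()
parser.read_string(CONFIG)
nested = expand(parser, "handler")
print(nested)
```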

> Okay, so, say I have a config.ini and I want to have logging sections
> for using in logging.config.fileConfig and other sections for use by my
> app's config.
>
> How would I share the one config file between fileConfig and whatever my
> app uses to tickle ConfigParser? Would each section have to parse the
> file? Would they get confused about keys not designed for them?

If you really want to use logging.config.fileConfig(), I'd suggest
your app having something like "logging-configuration =
/path/to/logging/config.ini", and using that to call the logging
configuration with the indicated file.
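
Something along these lines, as a sketch only.  The [app] section name
and configure_logging() are assumptions for the example; the temporary
files just make it self-contained, and the module names are the
Python 3 spellings.

```python
import configparser
import logging
import logging.config
import os
import tempfile

LOGGING_INI = """
[loggers]
keys = root

[handlers]
keys = console

[formatters]
keys = plain

[logger_root]
level = INFO
handlers = console

[handler_console]
class = StreamHandler
level = INFO
formatter = plain
args = (sys.stderr,)

[formatter_plain]
format = %(levelname)s %(message)s
"""


def configure_logging(app_config_path):
    """Read the app config and pass the referenced file to fileConfig()."""
    # Interpolation is off so a '%' in a path is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(app_config_path)
    logging_path = parser.get("app", "logging-configuration")
    logging.config.fileConfig(logging_path, disable_existing_loggers=False)
    return logging_path


with tempfile.TemporaryDirectory() as tmp:
    logging_path = os.path.join(tmp, "logging.ini")
    with open(logging_path, "w") as f:
        f.write(LOGGING_INI)
    app_path = os.path.join(tmp, "app.ini")
    with open(app_path, "w") as f:
        f.write("[app]\nlogging-configuration = %s\n" % logging_path)
    used = configure_logging(app_path)
    logging.getLogger("myapp").info("logging configured from %s", used)
```

Each file is then read by exactly one consumer, so neither parser sees
keys that were meant for the other.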

> Can one config.ini include other .ini files in the same way ZConfig allows?

No.

> What is this support for layering multiple files? I couldn't find it
> anywhere in the ConfigParser docs :-S

What this needs to be depends on the application.  There's a simple
layering included in ConfigParser (call read() with multiple
filenames, or readfp() more than once), but that doesn't serve
zc.buildout well.  You can look in the zc.buildout documentation and
code for what that does; look for "extends".
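
For reference, the simple built-in layering looks like this.  The
sketch uses the Python 3 module name (configparser) and read_string()
so that it needs no files on disk; read() with a list of filenames
layers in the same way, and in Python 3 readfp() was replaced by
read_file().  The section contents are made up.

```python
import configparser

BASE = """
[server]
host = 127.0.0.1
port = 8080
"""

SITE = """
[server]
port = 80
"""

parser = configparser.ConfigParser()
# Each read adds to the same parser; later values replace earlier ones.
parser.read_string(BASE)
parser.read_string(SITE)

host = parser.get("server", "host")
port = parser.get("server", "port")
print(host, port)
```

Note that the override is per option: "host" survives from the first
source even though the second one also has a [server] section.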


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"Every sin is the result of a collaboration." --Lucius Annaeus Seneca

From ianb at colorstudy.com  Fri Mar  9 17:18:50 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 09 Mar 2007 10:18:50 -0600
Subject: [Web-SIG] ConfigParser
Message-ID: <45F188EA.3070703@colorstudy.com>

Since there's lots of talk of ConfigParser, I thought I'd note some code 
  I've written that uses the basic API of ConfigParser but allows for 
some simple additions; in INITools (http://pythonpaste.org/initools/) 
specifically initools.configparser: 
http://pythonpaste.org/initools/initools/configparser.py.html

It keeps track of filenames and line numbers so it's possible to give 
more detailed error messages (though only if you have access to the 
underlying config parser object), and though not enabled by default it 
also includes options for things like "extends" to overlap sections, and 
${section:value} substitution.  Unlike some of the other ConfigParser 
alternatives out there, it doesn't extend the ini syntax or the types 
that ini files deal in (i.e., only strings).
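
To show what ${section:value} substitution means in practice, here is
a sketch using the Python 3 standard library's ExtendedInterpolation,
which happens to use the same syntax.  This is not the INITools API,
and the section names and paths are invented.

```python
import configparser

CONFIG = """
[paths]
base = /srv/site

[logging]
file = ${paths:base}/log/app.log
"""

parser = configparser.ConfigParser(
    interpolation=configparser.ExtendedInterpolation())
parser.read_string(CONFIG)

# The value is still a plain string after substitution.
log_file = parser.get("logging", "file")
print(log_file)
```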

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From chad at zetaweb.com  Fri Mar  9 19:22:58 2007
From: chad at zetaweb.com (Chad Whitacre)
Date: Fri, 09 Mar 2007 13:22:58 -0500
Subject: [Web-SIG] windows, pywebd, webctl
In-Reply-To: <C1A9F3A6-5574-4AAE-9AE1-31576B9F9BE1@zope.com>
References: <45EF9749.3070002@zetaweb.com>
	<C1A9F3A6-5574-4AAE-9AE1-31576B9F9BE1@zope.com>
Message-ID: <45F1A602.3000104@zetaweb.com>

Jim,

First, your comments re: paying attention to sysadmins are 
well-taken. Thanks.


 > I don't have a problem with people using whatever layout they
 > want. I don't even object to having common layouts that are
 > documented and taught. What I can't accept is a software
 > framework that *requires* a particular layout to function.

That's fair enough. So what if a proposed common executable acted 
like this:

   1. A configuration file may be specified on the command line.

   2. If no config file is named on the command line, then look
      for one in certain locations:

        /etc/<foo>.conf
        /usr/local/etc/<foo>.conf
        ~/etc/<foo>.conf
        ./etc/<foo>.conf

   3. <foo>.conf does basic process config (address, user/group,
      threads, etc.) and hands off to a second-layer config (be it
      paste.ini, zope.conf, etc.)

   4. The following are added to PYTHONPATH *if they exist*:

        ./lib/python2.x
        ./lib/python2.x/site-packages
        ./lib/python
        ./lib/python/site-packages


Such an executable would satisfy me. Would it be flexible enough 
to meet your requirements?
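
In code, steps 1, 2 and 4 might look roughly like the sketch below.
find_config() and extra_python_paths() are hypothetical names, and
taking the first match in the order listed is an assumption; the
proposal does not yet say which location should win.

```python
import os


def find_config(name, cli_path=None, exists=os.path.exists):
    """Return the config file to use, or None if nothing is found."""
    if cli_path is not None:
        return cli_path  # Step 1: an explicit command-line choice wins.
    candidates = [
        "/etc/%s.conf" % name,
        "/usr/local/etc/%s.conf" % name,
        os.path.expanduser("~/etc/%s.conf" % name),
        "./etc/%s.conf" % name,
    ]
    # Step 2: first match in the listed order (an assumption, see above).
    for path in candidates:
        if exists(path):
            return path
    return None


def extra_python_paths(version, isdir=os.path.isdir):
    """Step 4: return only the library directories that actually exist."""
    candidates = [
        "./lib/python%s" % version,
        "./lib/python%s/site-packages" % version,
        "./lib/python",
        "./lib/python/site-packages",
    ]
    return [path for path in candidates if isdir(path)]


# Stand-ins for the filesystem so the example runs anywhere.
fake_files = {"/usr/local/etc/site.conf", "./etc/site.conf"}
found = find_config("site", exists=fake_files.__contains__)
print(found)
```

The exists/isdir parameters are only there so the lookup can be
exercised without touching the real filesystem.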


chad

From jbauer at rubic.com  Fri Mar  9 20:30:12 2007
From: jbauer at rubic.com (Jeff Bauer)
Date: Fri, 09 Mar 2007 13:30:12 -0600
Subject: [Web-SIG] windows, pywebd, webctl
In-Reply-To: <45F1A602.3000104@zetaweb.com>
References: <45EF9749.3070002@zetaweb.com>	<C1A9F3A6-5574-4AAE-9AE1-31576B9F9BE1@zope.com>
	<45F1A602.3000104@zetaweb.com>
Message-ID: <45F1B5C4.4000808@rubic.com>

Chad Whitacre wrote:
>    2. If no config file is named on the command line, then look
>       for one in certain locations:
> 
>         /etc/<foo>.conf
>         /usr/local/etc/<foo>.conf
>         ~/etc/<foo>.conf
>         ./etc/<foo>.conf

And possibly the current working directory:  ./<foo>.conf

--
Jeff Bauer
Rubicon, Inc.

From jim at zope.com  Fri Mar  9 21:02:23 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 9 Mar 2007 15:02:23 -0500
Subject: [Web-SIG] [Proposal] "website" and first-level conf
In-Reply-To: <a7a2b76b0703090626gfb25c8axcf1a0ebe09f150b2@mail.gmail.com>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
	<45EDC772.3090803@simplistix.co.uk>
	<321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com>
	<45EFD930.1040406@simplistix.co.uk>
	<444CE28D-7E2E-478F-9B90-9DD76109FF24@zope.com>
	<a7a2b76b0703090626gfb25c8axcf1a0ebe09f150b2@mail.gmail.com>
Message-ID: <B3533F6B-7353-4140-A947-1073055642FA@zope.com>


On Mar 9, 2007, at 9:26 AM, Sidnei da Silva wrote:

> On 3/9/07, Jim Fulton <jim at zope.com> wrote:
>> On Mar 8, 2007, at 4:36 AM, Chris Withers wrote:

...

>> For years, people word
>> files ended up in the same directory with the word applications.
>
> I think that predates my involvement with computers, or you're
> misremembering something.

Kids these days!  I'm not misremembering.

>> If I was a windows server administrator, I would want the software  
>> to be
>> separate from other artifacts.
>
> Log files are usually separate from software. For example on XP, IIS
> 5.1 logs to C:\WINDOWS\system32\LogFiles. As you've already mentioned
> most configuration ends up on the registry. I don't see any mixing of
> software and artifacts going on.

I cleverly distracted you with my snipe at Windows. Bwahaha.

The origin of this particular point was Chris saying that he thought  
single directory layouts worked for deployment on all platforms. I  
suggested that a professional Windows server administrator  wouldn't  
like things in one directory.  My point is that, as with Unix,  
software deployed on Windows separates configuration from logging,  
from software and so on.  A normal windows application doesn't keep  
everything in one directory as we do on Windows.

Jim
--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jinty at web.de  Mon Mar 12 06:41:43 2007
From: jinty at web.de (Brian Sutherland)
Date: Mon, 12 Mar 2007 06:41:43 +0100
Subject: [Web-SIG] windows, pywebd, webctl
In-Reply-To: <45F1A602.3000104@zetaweb.com>
References: <45EF9749.3070002@zetaweb.com>
	<C1A9F3A6-5574-4AAE-9AE1-31576B9F9BE1@zope.com>
	<45F1A602.3000104@zetaweb.com>
Message-ID: <20070312054143.GC5066@minipas.home>

On Fri, Mar 09, 2007 at 01:22:58PM -0500, Chad Whitacre wrote:
> That's fair enough. So what if a proposed common executable acted 
> like this:
> 
>    1. A configuration file may be specified on the command line.

+lots

>    2. If no config file is named on the command line, then look
>       for one in certain locations:
> 
>         /etc/<foo>.conf
>         /usr/local/etc/<foo>.conf
>         ~/etc/<foo>.conf
>         ./etc/<foo>.conf

Perhaps you might want to think about /etc/<foo>/<foo>.conf, because
applications generally grow config files. In this case the second-layer
config.

Postgresql even does:

/etc/postgresql/${version}/${instance_name}/

So you can have many instances of many versions running at once. That
makes upgrading much easier.
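
As a sketch of that kind of layout (instance_config_dir() is a
hypothetical helper, and the version and instance names are only
examples):

```python
import posixpath


def instance_config_dir(app, version, instance, root="/etc"):
    """Return a per-version, per-instance configuration directory."""
    return posixpath.join(root, app, version, instance)


old = instance_config_dir("postgresql", "8.1", "main")
new = instance_config_dir("postgresql", "8.2", "main")
print(old)
print(new)
```

Both directories can exist side by side, which is what lets an old and
a new version run at the same time during an upgrade.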

>    3. <foo>.conf does basic process config (address, user/group,
>       threads, etc.) and hands off to a second-layer config (be it
>       paste.ini, zope.conf, etc.)

Perhaps specify the second-layer config file location in the first layer
config.

>    4. The following are added to PYTHONPATH *if they exist*:
> 
>         ./lib/python2.x
>         ./lib/python2.x/site-packages
>         ./lib/python
>         ./lib/python/site-packages

-1

Why not just write additional PYTHONPATH locations into the script when
you create it? The thing that creates the executable should know where
it's putting the libraries.

> Such an executable would satisfy me. Would it be flexible enough 
> to meet your requirements?
> 
> 
> 
> 
> chad

-- 
Brian Sutherland

From jinty at web.de  Mon Mar 12 06:26:40 2007
From: jinty at web.de (Brian Sutherland)
Date: Mon, 12 Mar 2007 06:26:40 +0100
Subject: [Web-SIG] windows, pywebd, webctl
In-Reply-To: <45F1A602.3000104@zetaweb.com>
References: <45EF9749.3070002@zetaweb.com>
	<C1A9F3A6-5574-4AAE-9AE1-31576B9F9BE1@zope.com>
	<45F1A602.3000104@zetaweb.com>
Message-ID: <20070312052640.GB5066@minipas.home>

On Fri, Mar 09, 2007 at 01:22:58PM -0500, Chad Whitacre wrote:
> Jim,
> 
> First, your comments re: paying attention to sysadmins are 
> well-taken. Thanks.

I was pointed to this conversation and would like to comment wearing my
sysadmin hat about what I would like. How I think web applications
should be installed on unix. Basically, I'll just go through what
happens when I install apache, squid or postgres on linux.

When I install an application that is a daemon, I want the following
things to happen automatically:
    * A new user for the daemon to run as is created to protect the
      daemon from the other users and the other users from the system.
    * A default config is placed unless one already exists in
      /etc/<application>/*.conf
    * Directories are laid out according to the FHS,
      _with_the_correct_permissions_.
    * Logrotate config placed in /etc/logrotate.d/
    * Initscripts placed in /etc/init.d and symlinked to /etc/rc*.d
    * Server is started
    * Hopefully logging is via syslog with reasonable rules in
      /etc/logcheck
    * SEL Policy (perhaps in future)
    * Upgrades from previous versions handled
    * Various other files placed around /etc

And, when I de-install, I want all of these things cleaned up in the
right way according to my specific flavor of Linux. Currently when
installing Zope, because of the way the instance model is hardwired, I
have to do a lot of these things manually.  That's bad when you are
working on a cluster of many hopefully identical machines.

By now, it should be obvious that the details of this process are
specific to my favorite distribution of Linux and that I install this as
a sysadmin.  Things are different if you are a developer, running BSD,
or running Windows.

Also, the infrastructure for doing all these things at install/deinstall
time already exists in the packaging infrastructure of most Linux
distributions. I think it would be a bad idea to duplicate this
infrastructure and all its OS-specific variations in a pure Python
packaging infrastructure.

At the moment, distutils and setuptools are the main interfaces between
the packaging infrastructure and python applications. Buried deep in
most packages is the line:

    pythonX.Y setup.py install --single-version-externally-managed --root=./<foo>

So, basically, I think that keeping sysadmins happy means maintaining
compatibility/extending a distutils style installation.

-- 
Brian Sutherland

From jim at zope.com  Mon Mar 12 15:01:20 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 12 Mar 2007 10:01:20 -0400
Subject: [Web-SIG] windows, pywebd, webctl
In-Reply-To: <20070312052640.GB5066@minipas.home>
References: <45EF9749.3070002@zetaweb.com>
	<C1A9F3A6-5574-4AAE-9AE1-31576B9F9BE1@zope.com>
	<45F1A602.3000104@zetaweb.com> <20070312052640.GB5066@minipas.home>
Message-ID: <376AD825-1180-4595-BFB5-49D5E78C34D6@zope.com>


On Mar 12, 2007, at 1:26 AM, Brian Sutherland wrote:

> On Fri, Mar 09, 2007 at 01:22:58PM -0500, Chad Whitacre wrote:
>> Jim,
>>
>> First, your comments re: paying attention to sysadmins are
>> well-taken. Thanks.
>
> I was pointed to this conversation and would like to comment  
> wearing my
> sysadmin hat about what I would like. How I think web applications
> should be installed on unix. Basically, I'll just go through what
> happens when I install apache, squid or postgres on linux.
>
> When I install an application that is a daemon,

There is an interesting subtlety here.  I think of Zope (or  
applications built using Zope components) as applications that can be  
run as one or more daemons.  To me, a daemon is a particular instance  
of an application, not the application itself.  I (and my SAs) prefer  
to separate software installation from configuration.  We prefer that  
these be 2 steps.  We often run multiple daemons of the same  
application on a single machine. The configuration of these daemons
(and cron jobs, and so on) is controlled from a central
configuration database that is mostly independent of the software
install.  We don't want daemons installed automatically when an
application is installed.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


From jinty at web.de  Mon Mar 12 15:35:45 2007
From: jinty at web.de (Brian Sutherland)
Date: Mon, 12 Mar 2007 15:35:45 +0100
Subject: [Web-SIG] windows, pywebd, webctl
In-Reply-To: <376AD825-1180-4595-BFB5-49D5E78C34D6@zope.com>
References: <45EF9749.3070002@zetaweb.com>
	<C1A9F3A6-5574-4AAE-9AE1-31576B9F9BE1@zope.com>
	<45F1A602.3000104@zetaweb.com> <20070312052640.GB5066@minipas.home>
	<376AD825-1180-4595-BFB5-49D5E78C34D6@zope.com>
Message-ID: <20070312143545.GB4923@minipas.home>

On Mon, Mar 12, 2007 at 10:01:20AM -0400, Jim Fulton wrote:
> 
> On Mar 12, 2007, at 1:26 AM, Brian Sutherland wrote:
> 
> > On Fri, Mar 09, 2007 at 01:22:58PM -0500, Chad Whitacre wrote:
> >> Jim,
> >>
> >> First, your comments re: paying attention to sysadmins are
> >> well-taken. Thanks.
> >
> > I was pointed to this conversation and would like to comment  
> > wearing my
> > sysadmin hat about what I would like. How I think web applications
> > should be installed on unix. Basically, I'll just go through what
> > happens when I install apache, squid or postgres on linux.
> >
> > When I install an application that is a daemon,
> 
> There is an interesting subtlety here.  I think of Zope (or  
> applications built using Zope components) as applications that can be  
> run as one or more daemons.  To me, a daemon is a particular instance  
> of an application, not the application itself.  I (and my SAs) prefer  
> to separate software installation from configuration.  We prefer that  
> these be 2 steps.  We often run multiple daemons of the same  
> application on a single machine. The configuration of these daemons
> (and cron jobs, and so on) is controlled from a central
> configuration database that is mostly independent of the software
> install.  We don't want daemons installed automatically when an
> application is installed.

Then perhaps you are more interested in a structure like the one
postgresql uses, where there is a namespace in /etc and /var/lib for the
specific instance of postgres. All, however, are run as the same system
user. 

Also, I'll note that a well designed packaging system should _never_
blindly overwrite already existing files in /etc, so I would implement
your case as:

* Install predefined configuration files in /etc
* Install daemon package

-- 
Brian Sutherland

From chris at simplistix.co.uk  Tue Mar 13 14:58:13 2007
From: chris at simplistix.co.uk (Chris Withers)
Date: Tue, 13 Mar 2007 13:58:13 +0000
Subject: [Web-SIG] [Proposal] "website" and first-level conf
In-Reply-To: <444CE28D-7E2E-478F-9B90-9DD76109FF24@zope.com>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
	<45EDC772.3090803@simplistix.co.uk>
	<321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com>
	<45EFD930.1040406@simplistix.co.uk>
	<444CE28D-7E2E-478F-9B90-9DD76109FF24@zope.com>
Message-ID: <45F6ADF5.2090905@simplistix.co.uk>

Jim Fulton wrote:
> 
>> I do a lot of unix deployment, and the thought of a buildout that 
>> sprays files all over the system, even if they are in standard unix-y 
>> location scares me a lot...
> 
> That's because you are a developer. 

OK, I see what you mean now, although I think it's clear that whatever 
choices we make, they should (easily) allow both models...

>>> This is what I do in my latest Zope 3 buildout recipes.
>>
>> Are those recipes available anywhere?
> 
>   http://www.python.org/pypi/zc.zope3recipes

Great, thanks :-)

Chris

-- 
Simplistix - Content Management, Zope & Python Consulting
            - http://www.simplistix.co.uk

From chris at simplistix.co.uk  Tue Mar 13 15:08:18 2007
From: chris at simplistix.co.uk (Chris Withers)
Date: Tue, 13 Mar 2007 14:08:18 +0000
Subject: [Web-SIG] ConfigParser for configuration
In-Reply-To: <9cee7ab80703090651o2d53c518pff4d52589bd97990@mail.gmail.com>
References: <45E8EB97.6090805@zetaweb.com>	
	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>	
	<45E99DC1.4010703@zetaweb.com>	
	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>	
	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>	
	<45EFDDA5.4010205@simplistix.co.uk>	
	<9cee7ab80703080430yc62a8bch3f3e08462d7f5792@mail.gmail.com>	
	<45F130AD.1000904@simplistix.co.uk>
	<9cee7ab80703090651o2d53c518pff4d52589bd97990@mail.gmail.com>
Message-ID: <45F6B052.9040903@simplistix.co.uk>

Fred Drake wrote:
> On 3/9/07, Chris Withers <chris at simplistix.co.uk> wrote:
>> You have free rein now, right? ;-)
> 
> Heh.  Compatibility is worth something, even to me.

Oh just BBB it ;-)

> The application itself has to understand that it's creating an
> arbitrarily nested structure from a simple (two-level) hierarchy.  How
> that happens is part of the application, not a magical helper library.

Funny, I always appreciated the help from the not-so-magical library. 
Saves a lot of wheel re-inventing when doing config for various projects...

> If you really want to use logging.config.fileConfig(), I'd suggest
> your app having something like "logging-configuration =
> /path/to/logging/config.ini", and using that to call the logging
> configuration with the indicated file.

OK.
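
A rough sketch of that suggestion, written for current Python (the
configparser module rather than the ConfigParser module of the time).
Only the "logging-configuration" option name comes from the message
above; the [app] section name and the configure_logging() helper are
made up for illustration.

```python
import configparser
import logging
import logging.config


def configure_logging(app_config_path, section="app"):
    """Find the logging ini file named in the app config and load it."""
    parser = configparser.ConfigParser()
    # read() returns the list of files it could actually parse.
    if not parser.read(app_config_path):
        raise FileNotFoundError(app_config_path)
    # Hypothetical layout:
    #   [app]
    #   logging-configuration = /path/to/logging/config.ini
    logging_ini = parser.get(section, "logging-configuration")
    # The referenced file is in the standard fileConfig() format.
    logging.config.fileConfig(logging_ini, disable_existing_loggers=False)
    return logging_ini
```

The application's own file stays a simple two-level structure, and the
logging details live in a file that only the logging package has to
understand.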

>> Can one config.ini include other .ini files in the same way ZConfig 
>> allows?
> 
> No.

:-/

> What this needs to be depends on the application.  There's a simple
> layering included in ConfigParser (call read() with multiple
> filenames, or readfp() more than once), but that doesn't serve
> zc.buildout well.  You can look in the zc.buildout documentation and
> code for what that does; look for "extends".

Ah, I think I'm getting the picture now.
So, basically, everything ends up in one dictionary, and you need to be 
careful nothing re-uses a key?

cheers,

Chris

-- 
Simplistix - Content Management, Zope & Python Consulting
            - http://www.simplistix.co.uk

From chris at simplistix.co.uk  Tue Mar 13 15:15:18 2007
From: chris at simplistix.co.uk (Chris Withers)
Date: Tue, 13 Mar 2007 14:15:18 +0000
Subject: [Web-SIG] more comments on Paste Deploy
In-Reply-To: <5BA15D06-6EF0-4315-9721-9B793D1E03B8@zope.com>
References: <45E99DC1.4010703@zetaweb.com>
	<45E8EB97.6090805@zetaweb.com>	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>	<45E99DC1.4010703@zetaweb.com>	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
	<45EFDDA5.4010205@simplistix.co.uk>
	<5BA15D06-6EF0-4315-9721-9B793D1E03B8@zope.com>
Message-ID: <45F6B1F6.9080109@simplistix.co.uk>

Jim Fulton wrote:
> 
>> It all gives indication. Yes, if only one person says "this sucks", 
>> then their opinion may not be worth changing the implementation for. 
>> However, if 50% of users said "this sucks", even if they couldn't 
>> explain why, that'd be something worth worrying about.
> 
> Sure, but how do you fix anything if they don't say why it sucks?  How 
> do you make it better?  How do you even know if they are trying to solve 
> the same problem that you are? Or if they've actually tried the tool 
> you're talking about?

These are all good points and they're the tough ones to answer. I've 
often found people are justified in their opinions even if they can't 
find a way to communicate the reasons for those opinions...

>>> The ini/config format is  pretty standard and, IMO, really quite 
>>> adequate.
>>
>> How does it handle nesting?
> 
> Using cross-section references.  So, rather than having an embedded 
> section, you have an option that refers to another section (or 
> collection of sections).

I finally get this now :-)
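
For anyone else catching up, here is a small sketch of what a
cross-section reference can look like, in current Python. The section
names, the "databases" option and the resolve_sections() helper are
all hypothetical; the idea is only that an option holds the names of
other sections instead of nesting them.

```python
import configparser

CONFIG = """
[app]
debug = false
databases = db-users db-orders

[db-users]
dsn = postgres://localhost/users

[db-orders]
dsn = postgres://localhost/orders
"""


def resolve_sections(parser, section, option):
    """Look up every section named in a whitespace-separated option."""
    names = parser.get(section, option).split()
    return {name: dict(parser[name]) for name in names}


parser = configparser.ConfigParser()
parser.read_string(CONFIG)
# The "nested" structure is rebuilt by the application, not the parser.
databases = resolve_sections(parser, "app", "databases")
```

A reference to a section that doesn't exist raises KeyError here, which
is probably what you want at startup.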

I do still worry about trying to figure out who's using what key (in 
terms of config files with sections for more than one type of 
configuration in them, as ZConfig provides).

Am I right in thinking the way to avoid this in ConfigParser is to have 
one file that references lots of other files? eg:

[config]
logging=logging.ini
zodb=zodb.ini
...etc..

cheers,

Chris

-- 
Simplistix - Content Management, Zope & Python Consulting
            - http://www.simplistix.co.uk

From fdrake at gmail.com  Tue Mar 13 15:38:26 2007
From: fdrake at gmail.com (Fred Drake)
Date: Tue, 13 Mar 2007 09:38:26 -0500
Subject: [Web-SIG] ConfigParser for configuration
In-Reply-To: <45F6B052.9040903@simplistix.co.uk>
References: <45E8EB97.6090805@zetaweb.com>
	<51801691-DF61-4E36-9E89-D6C62EFF98F9@zope.com>
	<45E99DC1.4010703@zetaweb.com>
	<5.1.1.6.0.20070305121722.0237efd8@sparrow.telecommunity.com>
	<363F723A-92A8-446C-8344-5C3E32101FEB@zope.com>
	<45EFDDA5.4010205@simplistix.co.uk>
	<9cee7ab80703080430yc62a8bch3f3e08462d7f5792@mail.gmail.com>
	<45F130AD.1000904@simplistix.co.uk>
	<9cee7ab80703090651o2d53c518pff4d52589bd97990@mail.gmail.com>
	<45F6B052.9040903@simplistix.co.uk>
Message-ID: <9cee7ab80703130738l64c805b8ubf76459b9f32c821@mail.gmail.com>

On 3/13/07, Chris Withers <chris at simplistix.co.uk> wrote:
> So, basically, everything ends up in one dictionary, and you need to be
> careful nothing re-uses a key?

The result is (essentially) a dictionary of dictionaries, so no,
there's no worry about overlapping keys across sections.
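
A quick illustration in current Python (configparser; the section and
option names are invented). Two sections use the same key without
clashing, and a second read layers on top of the first:

```python
import configparser

parser = configparser.ConfigParser()
parser.read_string("""
[logging]
level = debug
path = /var/log/myapp

[zodb]
path = /var/lib/myapp/Data.fs
""")

# Layering: a later read only overrides the options it mentions.
parser.read_string("""
[logging]
level = info
""")

# One dictionary per section, keyed by section name.
as_dict = {name: dict(parser[name]) for name in parser.sections()}
```

The one shared namespace is the [DEFAULT] section, whose values show up
in every other section.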


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"Every sin is the result of a collaboration." --Lucius Annaeus Seneca

From ianb at colorstudy.com  Tue Mar 13 20:47:54 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 13 Mar 2007 14:47:54 -0500
Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking
	Middleware
In-Reply-To: <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com>
Message-ID: <45F6FFEA.9080007@colorstudy.com>

Phillip J. Eby wrote:
>>  basically, where each object type results in a new key in the 
>> environment and a new ad hoc specification to be made (e.g., 
>> wsgi.file_wrapper takes a block size, which is specific only to that 
>> case).
> 
> Right.  I'm specifically saying that a collection of individual 
> specifications is much *better* than a single overarching specification 
> generalized from a single example.  Single use cases make bad general 
> specs.
> 
> 
>> OK, the dict would avoid multiple different kinds of keys, and 
>> presumably they'd all have the same signature.  Block size doesn't 
>> really make any sense to me as a common parameter.  Content type 
>> should be a common parameter, as something like an lxml object can be 
>> serialized as either XML or HTML.  I don't think any response headers 
>> are likely to affect the serialization... though with my specification 
>> that remains an application concern, so it doesn't have to be resolved 
>> in the specification.
> 
> Please don't keep trying to generalize this.  They're called 
> "specific-ations", not "general-izations".  :)
> 
> 
>>> Notice that this approach doesn't require any special protocol for 
>>> these wrappers -- just WSGI.  It's simpler to specify, and simpler to 
>>> implement than what you propose, while addressing some of the open 
>>> issues.
>>
>> The specification isn't particularly long or complicated, IMHO.
> 
> That's because it doesn't address any of the real issues -- they're all 
> deferred to your "open issues" section.  That's why I don't think having 
> the specification adds any value over highlighting the existing WSGI 
> pattern for extending the response (i.e. server-supplied 
> iterator-wrappers).

The open issues section has three issues.  One is a matter of defining 
some naming convention, and as long as people *try* to match up their 
conventions it will work.  The second has a proposed solution.  The last 
is merely aesthetic.

These are the "real issues" you are referring to?

>> When playing with implementation I used type names, and actually I 
>> rather prefer them, but it's not always clear what name to use.  For 
>> instance, "lxml", "lxml.etree", "lxml.etree.Element", and 
>> "lxml.etree._Element" all are reasonable names.  Or "ElementTree", 
>> "ElementTree.Element", "ElementTree._Element", "xml.etree", 
>> "xml.etree.Element", and "xml.etree._Element".  Or even something like 
>> "IElement" could make sense in some context (e.g., what if you can 
>> accept the overlapping interfaces of both lxml and ElementTree?)
>>
>> At least the actual type object seems easy enough.  OTOH, there are 
>> actually cases when I'd like to say that I could accept a certain type 
>> without having to import the type.  E.g., if I wanted to do an XSLT 
>> transformation, I *could* support several kinds of objects without 
>> requiring any of them (e.g., lxml, 4DOM, and Genshi Markup).
> 
> These problems all stem from premature generalization.  It's a trivial 
> problem to fix, however, if you are trying to share one particular 
> content type: just pick a key and use it!

That's not much easier, really.  It would still be documented, still 
needs to be implemented and defined properly.  The biggest difference is 
that it needs to be done again for each type of object.

> Libraries such as wsgiref can support this pattern by providing a 
> utility like "wrap_content(environ, content, default_wrapper, *keys)" 
> function that looks up "keys" to find a wrapper to use in place of the 
> default_wrapper.
> 
> 
>>>> The same things apply to the parsing of ``wsgi.input``, specifically
>>>> parsing form data.  A similar strategy is presented to avoid
>>>> unnecessarily reparsing that data.
>>> I would rather offer an optional 'get_file_storage()' method or some 
>>> such as a blessed WSGI extension, than have such an open-ended "get 
>>> whatever you want from the input object" concept floating around.  A 
>>> strategy which reinvents half of PEP 246 (the *old* PEP 246, before 
>>> it became almost as complicated as WSGI) seems like overkill to me.
>>
>> I don't really understand what you are proposing.
> 
> That wsgi.input be allowed to have a 'get_file_storage()' method that 
> can be called by applications, and that calling it means the input 
> stream must not have been read and will no longer be readable.
> 
> 
>> This part addresses the same issues as presented in 
>> http://wsgi.org/wsgi/Specifications/handling_post_forms
>>
>> I really don't *want* to write every wsgi.input to a temporary file 
>> just because someone else *might* want to reparse the input.  I'd much 
>> rather do it lazily, as 99% of the time reparsing won't happen.
> 
> I don't understand your complaint, as it seems unrelated to what I propose.

I didn't understand what you were proposing, I think.  I still don't 
really know what get_file_storage means.

>>>> Other Possibilities
>>>> -------------------
>>>>
>>>> * You could simply parse everything every time.
>>>> * You could pass data through callbacks in the environment (but this 
>>>> can
>>>> break non-aware middleware).
>>>> * You can make custom methods and keys for each case.
>>>> * You can use something other than WSGI.
>>> And you can use the established WSGI method for adding semantics to a 
>>> response, using a middleware-supplied wrapper.  I think this is 
>>> actually the best alternative.
>>
>> I really don't understand the advantage.
> 
> It's simple: *specifications are a liability in the general case*.  They 
> are supposed to be the record of negotiations between people who need to 
> co-operate, not an attempt to solve all possible problems.

This certainly doesn't solve all possible problems, it only addresses 
one particular issue.

> So, if your spec is only about how relatively tight-coupled WFC's (WSGI 
> framework components) talk to each other, it seems more properly the 
> business of a web framework, not WSGI.

Most of the places I want to use this are *not* at the framework level. 
  A simple example is just parsing form data without having to own the 
data, which is an outstanding issue with WSGI stacks, and can be done 
outside of a framework.  Another is how to communicate non-string data 
while having graceful fallback for string data.  This is of particular 
interest to me, as I turn WSGI into HTTP quite often, and there's 
definitely nothing but strings at that point.

> However, it *is* WSGI (wsgi-onic?) for the authors of certain components 
> to get together and say, "hey let's agree on this wrapper protocol"...  
> or better yet, a wrapper *implementation*.
> 
> This is way way better than having another spec.  Every godforsaken new 
> spec attached to WSGI just makes the whole thing seem way too 
> complicated.  In retrospect, I wish I hadn't supported some of the 
> options and doodads and whatnots that are in WSGI today.  If I had it to 
> do over, WSGI would be a lot simpler.

This is a wsgiorg. specification, not a wsgi., and it's not meant to 
solve all issues.  It is meant to be implementation neutral.

> However, it's not too late to stop adding new cruft -- and I consider 
> the idea of reinventing PEP 246 inside of WSGI to be cruft of a most 
> horrible kind.
> 
> 
>> Best practice is fine, though of course still needs to be documented, 
>> as this is hardly a practice that people would naturally think about 
>> or implement.
> 
> Well, it's in PEP 333.

It's a nice idea, but as far as I know no one has actually used 
wsgi.file_wrapper.  Though so far no one has paid very close attention 
to these kinds of performance issues either.  I think using it in a 
useful way requires platform-specific twiddling that no one cares to do.

>>   But I don't really think that practice would be any simpler or 
>> easier to describe if done completely.  In fact, I think it would take 
>> exactly the same amount of space to describe.
> 
> Even if it *did*, it'd still be better.  However, since it's not a spec, 
> it can be presented informally.  Here's an example:
> 
> "If you want to give applications underneath your middleware a chance to 
> return rich responses (i.e., objects instead of strings), follow the 
> pattern used for the WSGI 'file wrapper' object.  That is, have your 
> server or middleware add an environ key with a wrapper API that can 
> convert the richer objects you're expecting into a standard WSGI 
> iterator.  Then, your server can simply inspect the iterators it 
> receives to see if they are instances of your wrapper type, and pull out 
> the objects you want.  In this way, if there is middleware between you 
> and the application returning the rich response that modifies the 
> response body, you will receive an iterator of a different type, which 
> you can process in the usual way.  However, if you receive an instance 
> of your wrapper type, you will know that you can access the rich data 
> directly."
> 
> Now, can you expand this into more of a tutorial, give more hints and so 
> on?  Absolutely.  It'd be a great idea to.  But the basic idea is simple 
> and doesn't require rigorous definitions -- it just needs people to 
> publish what keys they're using and the *specifications thereof*.
> 
> What you're trying to specify is effectively a *meta*-specification: 
> much more difficult to do well, and not nearly as useful to have in this 
> case.

Except insofar as "type" is variable in my specification, I don't think 
it is substantially different.

If no one cares about this, then I guess I can just put it under the 
httpencode namespace where it was before, but I don't see any reason to 
make it less general.
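
For reference, here is roughly how I read the wrapper pattern described
in the quoted paragraph, as a sketch for one specific type (a JSON-able
dict) in current Python. The environ key, the JSONBody class and the
middleware are all hypothetical names, not part of any spec, and
close() handling is left out to keep it short.

```python
import json

ENV_KEY = "example.json_wrapper"  # hypothetical key, not a registered name


class JSONBody:
    """A normal WSGI iterable that also carries the unserialized object."""

    def __init__(self, data):
        self.data = data

    def __iter__(self):
        # Unaware servers and middleware only ever see bytes.
        yield json.dumps(self.data).encode("utf-8")


def add_generator_field(app):
    """Middleware that edits the response, avoiding a reparse when it can."""
    def middleware(environ, start_response):
        environ[ENV_KEY] = JSONBody  # advertise the wrapper
        body = app(environ, start_response)
        if isinstance(body, JSONBody):
            data = body.data  # rich object arrived untouched
        else:
            # Something in between re-serialized it: fall back to parsing.
            data = json.loads(b"".join(body))
        data["generator"] = "example"
        return JSONBody(data)
    return middleware


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "application/json")])
    data = {"title": "hello"}
    wrapper = environ.get(ENV_KEY)
    if wrapper is not None:
        return wrapper(data)
    return [json.dumps(data).encode("utf-8")]
```

The isinstance() check is what gives the graceful fallback: if any
middleware in between replaced the iterator, the outer layer just sees
ordinary bytes.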


-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From pje at telecommunity.com  Tue Mar 13 21:12:43 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 13 Mar 2007 15:12:43 -0500
Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking
 Middleware
In-Reply-To: <45F6FFEA.9080007@colorstudy.com>
References: <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com>

At 02:47 PM 3/13/2007 -0500, Ian Bicking wrote:
>The open issues section has three issues.  One is a matter of defining some 
>naming convention, and as long as people *try* to match up their 
>conventions it will work.  The second has a proposed solution.  The last 
>is merely aesthetic.
>
>These are the "real issues" you are referring to?

No - I'm saying that the real issues are all (and always) specific to the 
particular data type being exchanged.


>That's not much easier, really.  It would still be documented, still needs 
>to be implemented and defined properly.  The biggest difference is that it 
>needs to be done again for each type of object.

It has to be anyway.


>I didn't understand what you were proposing, I think.  I still don't 
>really know what get_file_storage means.

It would return a cgi.file_storage encoding the request body.


>It's a nice idea, but as far as I know no one has actually used 
>wsgi.file_wrapper.

I believe that the Jython WSGI implementation provides one, or something 
analogous that wraps certain types of Java stream objects.


>Except insofar as "type" is variable in my specification, I don't think it 
>is substantially different.

That is indeed the substance of the difference - yours is a 
meta-specification, rather than a specification.  As a result, it's more 
complicated to grasp than a pattern...  and significantly more difficult to 
get *right*.  And without examples, it's basically impossible to get right.


>If no one cares about this, then I guess I can just put it under the 
>httpencode namespace where it was before, but I don't see any reason to 
>make it less general.

It'll be worth making it general when there are more examples of the 
pattern to generalize from.  As you pointed out yourself, there are very 
few at the moment.


From ianb at colorstudy.com  Tue Mar 13 21:14:37 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 13 Mar 2007 15:14:37 -0500
Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking
	Middleware
In-Reply-To: <5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com>
Message-ID: <45F7062D.6060209@colorstudy.com>

Phillip J. Eby wrote:
>> I didn't understand what you were proposing, I think.  I still don't 
>> really know what get_file_storage means.
> 
> It would return a cgi.file_storage encoding the request body.

I still don't understand.  Are you talking about cgi.FieldStorage?  Are 
you talking about an implementation of something, or something in the 
environment?

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From pje at telecommunity.com  Tue Mar 13 21:34:40 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 13 Mar 2007 15:34:40 -0500
Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking
 Middleware
In-Reply-To: <45F7062D.6060209@colorstudy.com>
References: <5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070313153400.027f5ab8@sparrow.telecommunity.com>

At 03:14 PM 3/13/2007 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>>I didn't understand what you were proposing, I think.  I still don't 
>>>really know what get_file_storage means.
>>It would return a cgi.file_storage encoding the request body.
>
>I still don't understand.  Are you talking about cgi.FieldStorage?

Oops.  Yeah.  That should be get_field_storage(), then.  D'oh.  Sorry about 
that.  Obviously it's been a while since I've used one of those directly.  :)


From ianb at colorstudy.com  Tue Mar 13 22:15:46 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 13 Mar 2007 16:15:46 -0500
Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking
 Middleware
In-Reply-To: <5.1.1.6.0.20070313153400.027f5ab8@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070313153400.027f5ab8@sparrow.telecommunity.com>
Message-ID: <45F71482.3050204@colorstudy.com>

Phillip J. Eby wrote:
> At 03:14 PM 3/13/2007 -0500, Ian Bicking wrote:
>> Phillip J. Eby wrote:
>>>> I didn't understand what you were proposing, I think.  I still don't 
>>>> really know what get_file_storage means.
>>> It would return a cgi.file_storage encoding the request body.
>>
>> I still don't understand.  Are you talking about cgi.FieldStorage?
> 
> Oops.  Yeah.  That should be get_field_storage(), then.  D'oh.  Sorry 
> about that.  Obviously it's been a while since I've used one of those 
> directly.  :)

OK, we're getting closer, but I'm *still* not entirely sure what you are 
proposing.  Are you talking about adding a function to wsgiref that 
either parses the input with cgi.FieldStorage, or gets an existing 
parsed value?

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From pje at telecommunity.com  Tue Mar 13 22:38:44 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 13 Mar 2007 16:38:44 -0500
Subject: [Web-SIG] Proposal: Avoiding Serialization When Stacking
 Middleware
In-Reply-To: <45F71482.3050204@colorstudy.com>
References: <5.1.1.6.0.20070313153400.027f5ab8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306213018.04c0b558@sparrow.telecommunity.com>
	<5.1.1.6.0.20070306232230.02b101d8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070313150439.027d0cf8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070313153400.027f5ab8@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070313163754.027c8e28@sparrow.telecommunity.com>

At 04:15 PM 3/13/2007 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>At 03:14 PM 3/13/2007 -0500, Ian Bicking wrote:
>>>Phillip J. Eby wrote:
>>>>>I didn't understand what you were proposing, I think.  I still don't 
>>>>>really know what get_file_storage means.
>>>>It would return a cgi.file_storage encoding the request body.
>>>
>>>I still don't understand.  Are you talking about cgi.FieldStorage?
>>Oops.  Yeah.  That should be get_field_storage(), then.  D'oh.  Sorry 
>>about that.  Obviously it's been a while since I've used one of those 
>>directly.  :)
>
>OK, we're getting closer, but I'm *still* not entirely sure what you are 
>proposing.  Are you talking about adding a function to wsgiref that either 
>parses the input with cgi.FieldStorage, or gets an existing parsed value?

I was talking about defining a standard WSGI extension whereby the 
wsgi.input object could have such a method.
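
A minimal sketch of such an input object, in current Python. The class
name is hypothetical, and because the cgi module has been removed from
recent Python versions the parsed result here is a plain dict from
urllib.parse.parse_qs rather than a cgi.FieldStorage; it only handles
urlencoded bodies.

```python
import io
from urllib.parse import parse_qs


class FormAwareInput:
    """wsgi.input stand-in with an extra get_field_storage() method."""

    def __init__(self, stream, content_length):
        self._stream = stream
        self._length = content_length
        self._raw_read = False
        self._fields = None

    def read(self, size=-1):
        # Once parsed, the raw stream is no longer readable.
        if self._fields is not None:
            raise IOError("body already consumed by get_field_storage()")
        self._raw_read = True
        return self._stream.read(size)

    def get_field_storage(self):
        if self._fields is None:
            # The stream must not have been read before parsing.
            if self._raw_read:
                raise IOError("body was already read as a raw stream")
            body = self._stream.read(self._length)
            self._fields = parse_qs(body.decode("utf-8"))
        # Later callers share the one parsed result.
        return self._fields
```

An application would check hasattr(environ['wsgi.input'],
'get_field_storage') and fall back to parsing the stream itself when
the server doesn't offer the extension.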


From rodsenra at gpr.com.br  Fri Mar 16 15:46:56 2007
From: rodsenra at gpr.com.br (Rodrigo Senra)
Date: Fri, 16 Mar 2007 11:46:56 -0300
Subject: [Web-SIG] [Proposal] "website" and first-level conf
In-Reply-To: <45F1315F.4000000@simplistix.co.uk>
References: <f593a5ce0703032027h3e2e3acbn7f8299710d10afa8@mail.gmail.com>
	<35EED55B-5CD8-47F4-A434-7343282E443D@zope.com>
	<45EDC772.3090803@simplistix.co.uk>
	<321B3A5C-33CB-4B02-86F3-01FE2C801A3D@zope.com>
	<45EFD930.1040406@simplistix.co.uk> <20070308110244.56b81bd5@Fenix>
	<45F1315F.4000000@simplistix.co.uk>
Message-ID: <20070316114656.281c02e0@Fenix>


|Rodrigo Senra :
|>  - multiple Zope instances sharing libraries, python modules,
|>    and Zope/Plone Products. These files might be placed out of
|>    the instance tree.
[ Chris Withers ]:
|Sometimes you want this, sometimes you don't ;-)

Indeed.

|Rodrigo Senra :
|>  - one optimization (we actually do) is to create different disk
|>    partitions. One optimized for *large* files (like logs and 
|>    databases) and other for small files (like source code, libraries
|>    and config files).
[ Chris Withers ]:
|I've never seen the need myself, what measurable differences has this
|made?

I do not have quantitative results since I have done that separation
from the start. But, since there are file systems optimized for a few
large files and others for many small ones, it makes sense to trust
FS people and make use of that ;o) Nevertheless, I see your point that
without measurements it "might" no  be worth the trouble.
On the other hand, if you plan your partitons prior to any software
installation the overhead is minimal and any (unmeasured) performance
benefit if for free <0.5wink>.

|Rodrigo Senra :
|> In spite of that, I would love to keep deploys *totally*
|> self-contained. Nevertheless, I was not wise enough to workaround
|> some of the use cases presented above ;o)
[ Chris Withers ]:
|Sounds like we really need to support both...

+1

Abraço,
Senra

-------------
Rodrigo Senra
GPr Sistemas 
http://www.gpr.com.br

From graham.dumpleton at gmail.com  Wed Mar 21 11:36:07 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Wed, 21 Mar 2007 21:36:07 +1100
Subject: [Web-SIG] Direct use of sys.stdout,
	sys.stderr and sys.stdin in WSGI application.
Message-ID: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>

When one is using CGI as a means of implementing a WSGI application,
although one would return content through the iterable returned from
the application or by calling write() method returned from
start_response(), one could actually write to sys.stdout directly as
well since that is where the WSGI adapter writes it to anyway.

Obviously this isn't something that should be done but then the WSGI
PEP doesn't say anything about code not writing to sys.stdout and more
than likely at some point someone is going to think they can just use
'print' to have some debugging statements output where they think they
will see them. In the case of CGI such output would wrongly end up in
the response and screw things up.

To address this, a future update to the WSGI specification, or the
environment specification people have been talking about, should
perhaps clarify what behaviour one can expect from sys.stdin,
sys.stdout and sys.stderr.

In the case of sys.stdout, do people see it as at least good practice,
if not required by the specification, that the WSGI adapter should
ensure that sys.stdout cannot be written to, directly or by using
'print', from a WSGI application?  A CGI adapter would then do
something like:
  import sys

  class dummystdout:
    def write(self, *args):
      raise IOError("WSGI prohibits use of sys.stdout.")
    ....

  def run_with_cgi(application):
    ...

    stdout = sys.stdout
    sys.stdout = dummystdout()

    ...

    def write(data):
      ...
      stdout.write(data)
      stdout.flush()

In other words, it saves a reference to sys.stdout for its own use and
then replaces sys.stdout with a dummy file like object that raises an
exception if written to in any way or flushed.
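
A minimal runnable sketch of the lock-down described above, assuming
Python 3. The names ForbiddenStdout and locked_stdout are hypothetical,
chosen only for illustration; they are not part of any WSGI adapter.

```python
import sys
from contextlib import contextmanager


class ForbiddenStdout:
    """Hypothetical stand-in for sys.stdout that rejects writes and flushes."""

    def _refuse(self, *args, **kwargs):
        raise IOError("WSGI prohibits use of sys.stdout.")

    # Every way of producing output goes through the same refusal.
    write = writelines = flush = _refuse


@contextmanager
def locked_stdout():
    """Yield the real stdout for the adapter while sys.stdout is locked down."""
    real_stdout = sys.stdout
    sys.stdout = ForbiddenStdout()
    try:
        yield real_stdout
    finally:
        # Always restore, even if the application raised.
        sys.stdout = real_stdout
```

An adapter would run the application inside "with locked_stdout() as out:"
and send the response through out only, so a stray 'print' in application
code fails loudly instead of corrupting the response.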

Even in Apache where sys.stdout (if flushed) eventually makes its way
to the Apache error log, it seems it would also be a good idea to
disable sys.stdout. The idea here is that if all WSGI adapters ensured
that sys.stdout wasn't usable you would reduce the possibility of
someone's code inadvertently using it with one server and have it
seemingly work and then move to CGI and find it screws everything up.
Thus we are sort of protecting people by locking down the environment
a bit so application portability issues are more easily found.

With sys.stdin, you have a similar issue with CGI whereby you don't
want a WSGI application reading from it directly. Thus sys.stdin
should probably also be replaced with a file like object that always
returns EOF (empty string). Having sys.stdin do anything meaningful in
a multiple process server system like Apache also doesn't make sense,
although in the case of Apache it already ensures that stdin returns
EOF.
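
As a rough sketch of the sys.stdin replacement, assuming Python 3 and
text-mode access: EOFStdin and closed_stdin are hypothetical names. A
real CGI adapter would keep its own reference to the original stream
(most likely its binary buffer) in order to read the request body.

```python
import sys
from contextlib import contextmanager


class EOFStdin:
    """Hypothetical stand-in for sys.stdin that is always at end of file."""

    def read(self, size=-1):
        return ""

    def readline(self, size=-1):
        return ""

    def readlines(self, hint=-1):
        return []

    def __iter__(self):
        return iter(())


@contextmanager
def closed_stdin():
    """Yield the real stdin for the adapter; the application only sees EOF."""
    real_stdin = sys.stdin
    sys.stdin = EOFStdin()
    try:
        yield real_stdin
    finally:
        sys.stdin = real_stdin
```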

The tricky one is single process servers (which don't use sys.stdin
like CGI), as people may want to use interactive debuggers such as
pdb, although where a single process is actually multithreaded it
could preclude that to a degree unless you can stop two interactive
debuggers sessions being triggered at the same time. In Apache even if
one configures it to use only one child process this will still not
work. To get Apache to allow you to use pdb you have to run httpd
directly with the -DONE_PROCESS option.

Anyway, it may seem good practice for a WSGI adapter to still prevent
use of sys.stdin unless configured explicitly to allow it and even
then it might only allow it if the server is running in a mode whereby
it would work.

Finally, sys.stderr also presents problems of its own. Although
wsgi.errors is provided with the request environment, this can't be
used at global scope within a module when importing and also shouldn't
be used beyond the life time of the specific request. Thus, there
isn't a way to log stuff outside of a request and ensure it gets to
the server log. One could try and mandate use of 'logging' module, but
this isn't available in old versions of Python. Thus probably easier
to say that a WSGI adapter should always ensure that sys.stderr is
redirected to the server log. Only problem with this idea is that you
can potentially get interleaving of text when multithreading is being
used. What you need is for sys.stderr to be underlayed with thread
specific log objects each with its own buffering mechanism that
ensures that only complete lines of text get sent to the actual log
file. For a log object associated with a thread created to service a
request, it is easy enough to flush and clean up the log object at the
end of the request, but it is unclear what to do about user-created
threads, since it is harder to know when such a thread has finished so
that its log object can be cleaned up.
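
One possible shape for such a wrapper, as a sketch only: LineBufferedLog
is a hypothetical name, and the design (a thread-local pending buffer in
front of a lock-protected shared stream) is an assumption, not how any
particular server does it. It leaves open the question of cleaning up
after user-created threads.

```python
import threading


class LineBufferedLog:
    """Per-thread line buffering in front of a shared log stream."""

    def __init__(self, target):
        self._target = target
        self._lock = threading.Lock()
        self._local = threading.local()

    def write(self, text):
        # Accumulate text for the calling thread only.
        pending = getattr(self._local, "pending", "") + text
        *complete, rest = pending.split("\n")
        self._local.pending = rest
        if complete:
            # Only whole lines reach the shared stream, under a lock,
            # so lines from different threads cannot interleave.
            with self._lock:
                self._target.write("\n".join(complete) + "\n")
        return len(text)

    def flush(self):
        # Push out the calling thread's partial line, e.g. at end of request.
        rest = getattr(self._local, "pending", "")
        self._local.pending = ""
        with self._lock:
            if rest:
                self._target.write(rest + "\n")
            self._target.flush()
```

An adapter could assign sys.stderr = LineBufferedLog(server_log), where
server_log stands for whatever stream the server logs to, and call
flush() when each request finishes.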

Yes one could simply ignore the whole issue, but I feel that a good
quality WSGI adapter/server should address these issues and either
lock things down as appropriate to protect users from themselves or
ensure that using them results in a sensible outcome.

Anyone who appreciates what I am talking about here got any opinions of
their own about these issues?

Graham

From pywebsig at alan.kennedy.name  Thu Mar 22 12:29:27 2007
From: pywebsig at alan.kennedy.name (Alan Kennedy)
Date: Thu, 22 Mar 2007 11:29:27 +0000
Subject: [Web-SIG] Direct use of sys.stdout,
	sys.stderr and sys.stdin in WSGI application.
In-Reply-To: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
References: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
Message-ID: <4a951aa00703220429t7abaca96i643f7ac2284fbc9e@mail.gmail.com>

Graham,

I thought I'd reply, so that we'd get replies from everyone else to
tell me I'm wrong.

All your points are good common-sense stuff. I think that all of your
policies on stdin, stdout, and stderr are good, and are appropriate
for a WSGI environment running inside an Apache server.

Some small points.

> ..... one could actually write to sys.stdout directly as
> well since that is where the WSGI adapter writes it to anyway.

I think it's a good idea to redirect stdout, and to document in your
server/gateway documentation that you are doing so. I also think this
is a server specific issue.

> Anyway, it may seem good practice for a WSGI adapter to still prevent
> use of sys.stdin unless configured explicitly to allow it and even
> then it might only allow it if the server is running in a mode whereby
> it would work.

This should be a server-specific feature, that is documented.

> Finally, sys.stderr also presents problems of its own. Although
> wsgi.errors is provided with the request environment, this can't be
> used at global scope within a module when importing and also shouldn't
> be used beyond the life time of the specific request. Thus, there
> isn't a way to log stuff outside of a request and ensure it gets to
> the server log. One could try and mandate use of 'logging' module, but
> this isn't available in old versions of Python.

I don't think you need to worry about versions of python that don't
have the logging module. Strictly speaking, WSGI requires python 2.2,
because of iterators. So I think it's extremely unlikely that people
will be running WSGI apps on pre-2.2 VMs.

> What you need is for sys.stderr to be underlayed with thread
> specific log objects each with its own buffering mechanism that
> ensures that only complete lines of text get sent to the actual log
> file.

This is a server/gateway implementation detail.

> Yes one could simply ignore the whole issue, but I feel that a good
> quality WSGI adapter/server should address these issues and either
> lock things down as appropriate to protect users from themselves or
> ensure that using them results in a sensible outcome.

Given how much talk there is of the WSGI "environment", I think it's
good to raise these issues.

Regards,

Alan.

From pje at telecommunity.com  Thu Mar 22 16:30:00 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 22 Mar 2007 10:30:00 -0500
Subject: [Web-SIG] Direct use of sys.stdout,
 sys.stderr and  sys.stdin in WSGI application.
In-Reply-To: <4a951aa00703220429t7abaca96i643f7ac2284fbc9e@mail.gmail.co
 m>
References: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
	<88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
Message-ID: <5.1.1.6.0.20070322102826.02d12230@sparrow.telecommunity.com>

At 11:29 AM 3/22/2007 +0000, Alan Kennedy wrote:
>Strictly speaking, WSGI requires python 2.2,
>because of iterators.

Actually, it doesn't.  The pre-2.2 iterator protocol is to be used in such 
cases:

http://www.python.org/dev/peps/pep-0333/#supporting-older-2-2-versions-of-python


From ianb at colorstudy.com  Thu Mar 22 17:03:50 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 22 Mar 2007 11:03:50 -0500
Subject: [Web-SIG] Direct use of sys.stdout,
 sys.stderr and sys.stdin in WSGI application.
In-Reply-To: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
References: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
Message-ID: <4602A8E6.6080805@colorstudy.com>

Graham Dumpleton wrote:
> When one is using CGI as a means of implementing a WSGI application,
> although one would return content through the iterable returned from
> the application or by calling write() method returned from
> start_response(), one could actually write to sys.stdout directly as
> well since that is where the WSGI adapter writes it to anyway.
> 
> Obviously this isn't something that should be done but then the WSGI
> PEP doesn't say anything about code not writing to sys.stdout and more
> than likely at some point someone is going to think they can just use
> 'print' to have some debugging statements output where they think they
> will see them. In the case of CGI such output would wrongly end up in
> the response and screw things up.

Apparently I didn't ever fix up sys.stdout in my cgi-related code (I 
don't know if anyone actually uses it either), but I always intended to 
do so.  Particularly because the resulting bugs will be totally weird 
and hard to understand if people do print stuff.

I personally would capture stdout and put everything on stderr.

> To clarify this, a future update to WSGI specification or this
> environment specification people have been talking about, should
> perhaps clarify what behaviour one can expect out of sys.stdin,
> sys.stdout and sys.stderr.
> 
> In the case of sys.stdout, do people see it as being at least good
> practice, if not required by specification, that the WSGI adapter
> should ensure that sys.stdout cannot be written to directly or by
> using 'print' from a WSGI application. Thus, in a CGI adapter it would
> do something like:
> 
>   import sys
> 
>   class dummystdout:
>     def write(self, *args):
>       raise IOError("WSGI prohibits use of sys.stdout.")
>     ....
> 
>   def run_with_cgi(application):
>     ...
> 
>     stdout = sys.stdout
>     sys.stdout = dummystdout()
> 
>     ...
> 
>     def write(data):
>       ...
>       stdout.write(data)
>       stdout.flush()
> 
> In other words, it saves a reference to sys.stdout for its own use and
> then replaces sys.stdout with a dummy file like object that raises an
> exception if written to in any way or flushed.

As an avid user of "print" for debugging, this would bug me.  I would 
prefer just avoiding the CGI case where stdout goes to the client, and 
otherwise saying that the server should try to put stdout output 
someplace where it can be read.  But it could very well be a console, 
not necessarily a log file.  Or the same log file as stderr, or... 
something.

> With sys.stdin, you have a similar issue with CGI whereby you don't
> want a WSGI application reading from it directly. Thus sys.stdin
> should probably also be replaced with a file like object that always
> returns EOF (empty string). Having sys.stdin do anything meaningful in
> a multiple process server system like Apache also doesn't make sense,
> although in the case of Apache it already ensures that stdin returns
> EOF.

Yes, I don't see any real utility to sys.stdin, except potential confusion.

> The tricky one is single process servers (which don't use sys.stdin
> like CGI), as people may want to use interactive debuggers such as
> pdb, although where a single process is actually multithreaded it
> could preclude that to a degree unless you can stop two interactive
> debuggers sessions being triggered at the same time. In Apache even if
> one configures it to use only one child process this will still not
> work. To get Apache to allow you to use pdb you have to run up httpd
> direct with -DONE_PROCESS option.

Well... that's all true.  So I think this can be left up to the server.
Any CGI server should protect the user from unintentionally bypassing
the server.  Otherwise using sys.stdin probably implies some intention
that we don't really need to get in the way of.

> Finally, sys.stderr also presents problems of its own. Although
> wsgi.errors is provided with the request environment, this can't be
> used at global scope within a module when importing and also shouldn't
> be used beyond the life time of the specific request. Thus, there
> isn't a way to log stuff outside of a request and ensure it gets to
> the server log. One could try and mandate use of 'logging' module, but
> this isn't available in old versions of Python. Thus probably easier
> to say that a WSGI adapter should always ensure that sys.stderr is
> redirected to the server log. Only problem with this idea is that you
> can potentially get interleaving of text when multithreading is being
> used. What you need is for sys.stderr to be underlayed with thread
> specific log objects each with its own buffering mechanism that
> ensures that only complete lines of text get sent to the actual log
> file. For log object associated with threads created to service a
> request, easy enough to flush and cleanup such log object at the end
> of the request, but what to do about user created threads as harder to
> know when thread has finished and cleanup as necessary.

I think sys.stderr and sys.stdout are fairly similar.  wsgi.errors 
*could* be improved over a simple stream (e.g., you could cache stuff 
written to it, and write it in one chunk that is all the errors for the 
request).  But you could also just create some middleware that does 
that, writing to the server logs.

> Yes one could simply ignore the whole issue, but I feel that a good
> quality WSGI adapter/server should address these issues and either
> lock things down as appropriate to protect users from themselves or
> ensure that using them results in a sensible outcome.
> 
> Anyone who appreciates what I am talking here got any opinions of
> their own about these issues?

I guess in practice this hasn't been a problem for me.  In a CGI context 
these things certainly should be resolved because of the overlap.  But 
very few people use a CGI server, so it doesn't seem to come up often.


-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
             | Write code, do good | http://topp.openplans.org/careers

From pywebsig at alan.kennedy.name  Thu Mar 22 17:52:01 2007
From: pywebsig at alan.kennedy.name (Alan Kennedy)
Date: Thu, 22 Mar 2007 16:52:01 +0000
Subject: [Web-SIG] Direct use of sys.stdout,
	sys.stderr and sys.stdin in WSGI application.
In-Reply-To: <5.1.1.6.0.20070322102826.02d12230@sparrow.telecommunity.com>
References: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
	<5.1.1.6.0.20070322102826.02d12230@sparrow.telecommunity.com>
Message-ID: <4a951aa00703220952k6215b122vdd247b01c2e651cb@mail.gmail.com>

[Alan Kennedy]
>>Strictly speaking, WSGI requires python 2.2,
>>because of iterators.

[Phillip J. Eby]
> Actually, it doesn't.  The pre-2.2 iterator protocol is to be used in such
> cases:
>
> http://www.python.org/dev/peps/pep-0333/#supporting-older-2-2-versions-of-python

Dang! I knew I couldn't say anything on web-sig without being contradicted ;-)

I am familiar with that section. I'm sure you remember writing this in
the credits section: "Alan Kennedy, whose courageous attempts to
implement WSGI-on-Jython (well before the spec was finalized) helped
to shape the "supporting older versions of Python" section".

But if the users want their "modern" python applications to be
portable everywhere on WSGI, e.g. returning (iterable) files as output,
or generators, then they should really stick with 2.2+.

But you are, of course, right about the pre-2.2 iterator protocol. I
wrote modjy for jython 2.1 according to the PEP guidelines, and have
had user reports that it works without modification on jython 2.2+.

Regards,

Alan.

From pje at telecommunity.com  Thu Mar 22 20:45:53 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 22 Mar 2007 14:45:53 -0500
Subject: [Web-SIG] Direct use of sys.stdout,
 sys.stderr and  sys.stdin in WSGI application.
In-Reply-To: <4a951aa00703220952k6215b122vdd247b01c2e651cb@mail.gmail.co
 m>
References: <5.1.1.6.0.20070322102826.02d12230@sparrow.telecommunity.com>
	<88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
	<5.1.1.6.0.20070322102826.02d12230@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070322144429.02c5b8b8@sparrow.telecommunity.com>

At 04:52 PM 3/22/2007 +0000, Alan Kennedy wrote:
>But if the users want their "modern" python applications to be
>portable everywhere on WSGI, e.g. returning (iterable) files as output,

Actually, returning a file as output is a horrible idea, since it will 
massively reduce throughput, due to transmitting one line at a time to the 
web browser.  :)
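
To illustrate the point: iterating a Python file object yields one line
per step, and a WSGI server sends each yielded block as it arrives. A
sketch of the usual alternative is to yield fixed-size blocks instead;
iter_blocks and the 8192-byte default are assumptions for illustration
(PEP 333 also describes an optional wsgi.file_wrapper for this purpose).

```python
def iter_blocks(fileobj, block_size=8192):
    """Yield the contents of fileobj in blocks of up to block_size bytes."""
    while True:
        block = fileobj.read(block_size)
        if not block:
            # An empty result means end of file.
            break
        yield block
```

An application would return iter_blocks(open(path, "rb")) rather than the
file object itself, where path is whatever file it is serving.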


From graham.dumpleton at gmail.com  Thu Mar 22 22:03:04 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Fri, 23 Mar 2007 08:03:04 +1100
Subject: [Web-SIG] Direct use of sys.stdout,
	sys.stderr and sys.stdin in WSGI application.
In-Reply-To: <4602A8E6.6080805@colorstudy.com>
References: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
	<4602A8E6.6080805@colorstudy.com>
Message-ID: <88e286470703221403s4ef424a8q14ac2ffb4ff74d36@mail.gmail.com>

Thanks for all the input, gives me some things to think about.

On 23/03/07, Ian Bicking <ianb at colorstudy.com> wrote:
> Graham Dumpleton wrote:
> > In the case of sys.stdout, do people see it as being at least good
> > practice, if not required by specification, that the WSGI adapter
> > should ensure that sys.stdout cannot be written to directly or by
> > using 'print' from a WSGI application. Thus, in a CGI adapter it would
> > do something like:
> >
> >   import sys
> >
> >   class dummystdout:
> >     def write(self, *args):
> >       raise IOError("WSGI prohibits use of sys.stdout.")
> >     ....
> >
> >   def run_with_cgi(application):
> >     ...
> >
> >     stdout = sys.stdout
> >     sys.stdout = dummystdout()
> >
> >     ...
> >
> >     def write(data):
> >       ...
> >       stdout.write(data)
> >       stdout.flush()
> >
> > In other words, it saves a reference to sys.stdout for its own use and
> > then replaces sys.stdout with a dummy file like object that raises an
> > exception if written to in any way or flushed.
>
> As an avid user of "print" for debugging, this would bug me.  I would
> prefer just avoiding the CGI case where stdout goes to the client, and
> otherwise saying that the server should try to put stdout output
> someplace where it can be read.  But it could very well be a console,
> not necessarily a log file.  Or the same log file as stderr, or...
> something.

Using 'print' is handy, but the reason I was making sys.stdout
off limits and not just merging the output with sys.stderr, is that at
least one Python web framework hijacks sys.stdout for their own
purposes so that people can use 'print' to generate the actual content
of the response. The package that does this is web.py
(http://webpy.org/). Not sure if there are others which do this.

Graham

From ianb at colorstudy.com  Thu Mar 22 22:06:56 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 22 Mar 2007 16:06:56 -0500
Subject: [Web-SIG] Direct use of sys.stdout,
 sys.stderr and sys.stdin in WSGI application.
In-Reply-To: <88e286470703221403s4ef424a8q14ac2ffb4ff74d36@mail.gmail.com>
References: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>	
	<4602A8E6.6080805@colorstudy.com>
	<88e286470703221403s4ef424a8q14ac2ffb4ff74d36@mail.gmail.com>
Message-ID: <4602EFF0.1040201@colorstudy.com>

Graham Dumpleton wrote:
>> As an avid user of "print" for debugging, this would bug me.  I would
>> prefer just avoiding the CGI case where stdout goes to the client, and
>> otherwise saying that the server should try to put stdout output
>> someplace where it can be read.  But it could very well be a console,
>> not necessarily a log file.  Or the same log file as stderr, or...
>> something.
> 
> Although using 'print' is handy. The reason I  was making sys.stdout
> off limits and not just merging the output with sys.stderr, is that at
> least one Python web framework hijacks sys.stdout for their own
> purposes so that people can use 'print' to generate the actual content
> of the response. The package that does this is web.py
> (http://webpy.org/). Not sure if there are others which do this.

I don't know of any others.  As a debugging tool I'm not as concerned, 
as if a web.py user used something I wrote I would have hopefully 
removed all prints -- if I hadn't, it would be a bug (not an uncommon 
bug, but a bug).  And the web.py user just won't do this, because 
they'll instantly break their app.

Paste also has something that will capture prints/sys.stdout and put it 
into the page that is served up (paste.debug.prints).  That middleware 
strategy would probably work regardless of what the server does.


-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
             | Write code, do good | http://topp.openplans.org/careers

From graham.dumpleton at gmail.com  Thu Mar 22 22:11:26 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Fri, 23 Mar 2007 08:11:26 +1100
Subject: [Web-SIG] Direct use of sys.stdout,
	sys.stderr and sys.stdin in WSGI application.
In-Reply-To: <5.1.1.6.0.20070322144429.02c5b8b8@sparrow.telecommunity.com>
References: <88e286470703210336h2d780432tf9f28f75e8366bfa@mail.gmail.com>
	<5.1.1.6.0.20070322102826.02d12230@sparrow.telecommunity.com>
	<5.1.1.6.0.20070322144429.02c5b8b8@sparrow.telecommunity.com>
Message-ID: <88e286470703221411w28a2434et46fabed2eca810c6@mail.gmail.com>

On 23/03/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 04:52 PM 3/22/2007 +0000, Alan Kennedy wrote:
> >But if the users want their "modern" python applications to be
> >portable everywhere on WSGI, e.g. returning (iterable) files as output,
>
> Actually, returning a file as output is a horrible idea, since it will
> massively reduce throughput, due to transmitting one line at a time to the
> web browser.  :)

FWIW, in mod_wsgi I have a directive which allows one to optionally
override the prescribed WSGI behaviour of flushing after every chunk
returned. Instead, the data gets buffered up by Apache and written as
a large block rather than small blocks. Obviously you can't use this if
you intend to stream data, and it is probably not a good idea if
something is returning a huge amount of data, but I added it in case you
are using some third party WSGI component which is written in a sloppy
way and generates lots of small blocks, and you can't change it easily
or quickly. With minimal effort the directive allows you to quickly
improve throughput while you perhaps address the issues in the WSGI
component or add on top your own middleware component which does the
buffering in some other way which suits the actual application better.
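
A sketch of what such a buffering middleware component might look like,
assuming Python 3 byte strings. BufferingMiddleware, coalesce and the
8192-byte threshold are hypothetical and are not mod_wsgi code. As noted
above, this is unsuitable for streaming responses, and it does not touch
data sent through the write() callable.

```python
def coalesce(app_iter, min_size=8192):
    """Join small blocks from app_iter into blocks of at least min_size."""
    pending = []
    size = 0
    try:
        for chunk in app_iter:
            if chunk:
                pending.append(chunk)
                size += len(chunk)
            if size >= min_size:
                yield b"".join(pending)
                pending = []
                size = 0
        if pending:
            # Whatever is left over goes out as a final, smaller block.
            yield b"".join(pending)
    finally:
        # Honour the close() protocol of the wrapped iterable.
        close = getattr(app_iter, "close", None)
        if close is not None:
            close()


class BufferingMiddleware:
    """Wrap a WSGI application so its output is sent in larger blocks."""

    def __init__(self, app, min_size=8192):
        self.app = app
        self.min_size = min_size

    def __call__(self, environ, start_response):
        return coalesce(self.app(environ, start_response), self.min_size)
```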

Graham

From graham.dumpleton at gmail.com  Fri Mar 30 00:09:49 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Fri, 30 Mar 2007 08:09:49 +1000
Subject: [Web-SIG] CherryPy WSGI server and wsgi.input.read() with no
	argument.
In-Reply-To: <435DF58A933BA74397B42CDEB8145A860A96FFB8@ex9.hostedexchange.local>
References: <88e286470703290348j68b0a333qb6e9935b610fa494@mail.gmail.com>
	<435DF58A933BA74397B42CDEB8145A860A96FFB8@ex9.hostedexchange.local>
Message-ID: <88e286470703291509r2702fd0cp2fe08f40b72624ae@mail.gmail.com>

Have cc'd this over to the web-sig list in case anyone wants to shoot
me down. :-)

On 30/03/07, Robert Brewer <fumanchu at amor.org> wrote:
> > Robert, was doing some testing with CherryPy WSGI server and noted
> > that if read() is called with no arguments on wsgi.input that it just
> > seems to hang indefinitely. Is there a problem here or have I managed
> > to stuff up a very simple test. It works okay when I explicitly specify
> > the content length.
>
> That's right. We simply hand the (blocking, makefiled) socket to the app
> as wsgi.input. PEP 333 says:
>
>     "The server is not required to read past the client's
>     specified Content-Length, and is allowed to simulate
>     an end-of-file condition if the application attempts
>     to read past that point. The application should not
>     attempt to read more data than is specified by the
>     CONTENT_LENGTH variable."
>
> We chose to not simulate the EOF, requiring app authors do that for
> themselves (mostly to give apps more flexibility). Note that the app
> side of CherryPy handles this for you by default. But since the spec
> clearly places the responsibility of checking content-length on the
> application side, it seemed redundant to perform the check both on the
> app side and the server side.

As I believe I have pointed out on the Python web-sig list before, the
statement:

"The application should not attempt to read more data than is
specified by the CONTENT_LENGTH variable."

is actually a bit bogus.

This is because a WSGI middleware component or web server could be
acting as an input filter and decompressing a content encoding of gzip
for the request. Since it knows the size will change but will not know
what the new size will be, except by buffering it all, it by rights
should remove CONTENT_LENGTH. This presents a problem for an
application, as there is then no CONTENT_LENGTH to rely on to know
whether it has read too much input. If you leave CONTENT_LENGTH intact,
it may think it has read everything when there is in fact more.

Also, with chunked transfer encoding you will not have CONTENT_LENGTH
either. I know you read it all in and buffer it so you can calculate
it, but that prevents streaming with chunked encoding where content
length may be based on a series of end to end communications.

Thus, an application should really be just ignoring CONTENT_LENGTH and
just successively calling read() in some way until it returns an empty
string. It can't really work reliably in any other way. I believe that
the WSGI adapter should be required (not just allowed) to simulate EOF
if it believes that no more input is available for that request. For
example, it knows at a low level that CONTENT_LENGTH was valid because
no filtering has occurred by that point, or that with chunked encoding
the terminating null block has been sent. The adapter is generally the
only place where this can be known.
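
A sketch of how an adapter might simulate EOF when it trusts the
Content-Length it received. LimitedInput is a hypothetical name, and
only read() is shown; a full wsgi.input would also need readline(),
readlines() and iteration.

```python
class LimitedInput:
    """Wrap a blocking stream and report EOF after content_length bytes."""

    def __init__(self, stream, content_length):
        self._stream = stream
        self._remaining = max(content_length, 0)

    def read(self, size=-1):
        if self._remaining <= 0:
            # Simulated EOF: never touch the socket past the request body.
            return b""
        if size is None or size < 0 or size > self._remaining:
            size = self._remaining
        data = self._stream.read(size)
        self._remaining -= len(data)
        return data
```

With this in place read() with no argument returns the rest of the body
and then an empty string, instead of blocking on the socket.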

The only time that CONTENT_LENGTH may be of interest to an application
is if it is acting as a proxy to downstream web server as then it
needs to put it in downstream request. If no CONTENT_LENGTH or chunked
transfer encoding it would be forced to use chunked encoding for
downstream request.

FWIW, the conclusion I have come to is that when read() is called with
no arguments, rather than attempting to read all the input in one go
based on some content length, the adapter should transparently insert
its own size argument underneath. This size would be based on some
block size deemed likely to give the best performance for the
technology being used. Thus read() with no arguments would always
return potentially partial data and not all data.

This is valid because semantics of read() for a file like object is
that one should call it until it returns an empty string as EOF
indicator. WSGI PEP is ambiguous in that respect as it says it is a
file like object but then says you aren't supposed to read more than
CONTENT_LENGTH and that an adapter doesn't have to simulate EOF.
One may say that this overrides file like object properties, but the
WSGI way will not work all the time.

Graham

From foom at fuhm.net  Fri Mar 30 00:52:41 2007
From: foom at fuhm.net (James Y Knight)
Date: Thu, 29 Mar 2007 18:52:41 -0400
Subject: [Web-SIG] CherryPy WSGI server and wsgi.input.read() with no
	argument.
In-Reply-To: <88e286470703291509r2702fd0cp2fe08f40b72624ae@mail.gmail.com>
References: <88e286470703290348j68b0a333qb6e9935b610fa494@mail.gmail.com>
	<435DF58A933BA74397B42CDEB8145A860A96FFB8@ex9.hostedexchange.local>
	<88e286470703291509r2702fd0cp2fe08f40b72624ae@mail.gmail.com>
Message-ID: <F8F9093A-E957-4567-BA4A-6807EEAB1254@fuhm.net>


On Mar 29, 2007, at 6:09 PM, Graham Dumpleton wrote:
> On 30/03/07, Robert Brewer <fumanchu at amor.org> wrote:
>
>> We chose to not simulate the EOF, requiring app authors do that for
>> themselves

CherryPy's developers are correct: they are following the WSGI spec.  
It is your app that is broken.

> As I believe I have pointed out on the Python web-sig list before, the
> statement:
>
> ""The application should not attempt to read more data than is
> specified by the CONTENT_LENGTH variable."""
>
> is actually a bit bogus.

This requirement comes from CGI. CGI scripts cannot support unknown  
data lengths (yes, this means no chunked transfer). CONTENT_LENGTH is  
required to be provided if there is data, and the server is not  
required to provide an EOF after reading CONTENT_LENGTH bytes. WSGI  
inherits the same restrictions.

I do agree with you that this was a mistake. WSGI should require WSGI  
servers/gateway to provide an EOF for read(), always, and should make  
a break from CGI and declare that CONTENT_LENGTH=0 means no data and  
CONTENT_LENGTH empty/missing means undefined length. This is  
something which ought to be fixed for the next revision of WSGI. This  
makes it a tiny bit harder to write a CGI gateway, of course, but  
it's worth it in my opinion, for the reasons you describe.

HOWEVER, given that the current WSGI spec does not specify that, apps  
*cannot* depend upon that behavior. If your app does an unbounded  
read(), it's wrong. And, by reference to the CGI spec, if a server omits  
CONTENT_LENGTH, and there is data, it is wrong. The server ought to  
return a 411 Length Required if you attempt to access a WSGI app and  
provide chunked data.
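
For comparison, a sketch of a bounded read as the current spec expects
from an application, assuming Python 3 and a byte-oriented wsgi.input.
The helper name read_body and the 65536-byte block size are made up for
illustration.

```python
def read_body(environ, block_size=65536):
    """Read exactly CONTENT_LENGTH bytes of request body, never more."""
    try:
        remaining = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        remaining = 0
    stream = environ["wsgi.input"]
    parts = []
    while remaining > 0:
        chunk = stream.read(min(remaining, block_size))
        if not chunk:
            # The client went away before sending the whole body.
            break
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)
```

A missing or empty CONTENT_LENGTH is treated as "no data" here, which is
exactly the CGI-inherited behavior under discussion.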

And, indeed, server code I wrote is wrong in just this way: it can  
omit CONTENT_LENGTH when given chunked data on input. Spec-compliant  
WSGI apps would then assume there's no input data which will then  
cause data loss. Luckily nobody ever passes chunked data on input. :)

James

PS: what about the readline(size) problem? Are we just going to  
continue indefinitely pretending that it's okay that the spec forbids  
using readline(size) and that cgi.FieldStorage calls it? Perhaps a  
WSGI 1.1 fixing these issues would be a good idea? 

From graham.dumpleton at gmail.com  Fri Mar 30 01:59:05 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Fri, 30 Mar 2007 09:59:05 +1000
Subject: [Web-SIG] CherryPy WSGI server and wsgi.input.read() with no
	argument.
In-Reply-To: <F8F9093A-E957-4567-BA4A-6807EEAB1254@fuhm.net>
References: <88e286470703290348j68b0a333qb6e9935b610fa494@mail.gmail.com>
	<435DF58A933BA74397B42CDEB8145A860A96FFB8@ex9.hostedexchange.local>
	<88e286470703291509r2702fd0cp2fe08f40b72624ae@mail.gmail.com>
	<F8F9093A-E957-4567-BA4A-6807EEAB1254@fuhm.net>
Message-ID: <88e286470703291659h2b933a35k5ca844b8f3d78eb@mail.gmail.com>

On 30/03/07, James Y Knight <foom at fuhm.net> wrote:
>
> On Mar 29, 2007, at 6:09 PM, Graham Dumpleton wrote:
> > On 30/03/07, Robert Brewer <fumanchu at amor.org> wrote:
> >
> >> We chose to not simulate the EOF, requiring app authors do that for
> >> themselves
>
> CherryPy's developers are correct: they are following the WSGI spec.
> It is your app that is broken.

Since my app is a ten line test program just to test what the CherryPy
WSGI server does, I am not too concerned. :-)

> This requirement comes from CGI. CGI scripts cannot support unknown
> data lengths (yes, this means no chunked transfer). CONTENT_LENGTH is
> required to be provided if there is data, and the server is not
> required to provide an EOF after reading CONTENT_LENGTH bytes. WSGI
> inherits the same restrictions.
>
> I do agree with you that this was a mistake. WSGI should require WSGI
> servers/gateway to provide an EOF for read(), always, and should make
> a break from CGI and declare that CONTENT_LENGTH=0 means no data and
> CONTENT_LENGTH empty/missing means undefined length. This is
> something which ought to be fixed for the next revision of WSGI. This
> makes it a tiny bit harder to write a CGI gateway, of course, but
> it's worth it in my opinion, for the reasons you describe.
>
> HOWEVER, given that the current WSGI spec does not specify that, apps
> *cannot* depend upon that behavior. If your app does an unbounded
> read(), it's wrong. And, by reference to the CGI spec, if a server omits
> CONTENT_LENGTH, and there is data, it is wrong. The server ought to
> return a 411 Length Required if you attempt to access a WSGI app and
> provide chunked data.
>
> And, indeed, server code I wrote is wrong in just this way: it can
> omit CONTENT_LENGTH when given chunked data on input. Spec-compliant
> WSGI apps would then assume there's no input data which will then
> cause data loss. Luckily nobody ever passes chunked data on input. :)
>
> James
>
> PS: what about the readline(size) problem? Are we just going to
> continue indefinitely pretending that it's okay that the spec forbids
> using readline(size) and that cgi.FieldStorage calls it? Perhaps a
> WSGI 1.1 fixing these issues would be a good idea?

At least we agree on the problems with the WSGI specification.
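
To make the point concrete, here is a minimal sketch (modern Python 3, not code from this thread) of an application-side read that never goes past CONTENT_LENGTH. The helper name read_request_body is hypothetical; it treats a missing or empty CONTENT_LENGTH as "no body", which is the WSGI 1.0 reading described above.

```python
import io


def read_request_body(environ):
    """Read the request body without reading past CONTENT_LENGTH.

    A missing, empty, or non-numeric CONTENT_LENGTH is treated as "no body",
    so this never issues an unbounded read() on wsgi.input.
    """
    raw = environ.get("CONTENT_LENGTH") or "0"
    try:
        remaining = int(raw)
    except ValueError:
        remaining = 0
    if remaining <= 0:
        return b""

    stream = environ["wsgi.input"]
    chunks = []
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            # The server signalled EOF early; stop instead of looping forever.
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Usage: only the declared five bytes are consumed.
example_environ = {"CONTENT_LENGTH": "5", "wsgi.input": io.BytesIO(b"helloEXTRA")}
example_body = read_request_body(example_environ)
```

The loop is there because a single read(n) may legitimately return fewer than n bytes on some servers.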

My problem now is whether, in mod_wsgi, I implement it exactly as per
the WSGI 1.0 specification and thus propagate these problems and
limitations (and thereby block use of cgi.FieldStorage), or whether, if
we can get some forward-looking consensus on what WSGI 1.1 should do,
I implement to that instead.

I would rather address the problems now, because in the Apache world, once
an Apache module gets installed, especially by a web hosting provider,
it stays at that version for ages. On the mod_python list we still
have to deal with people using older versions of mod_python
2.7/3.0/3.1 which are many years old even though we are up to
mod_python 3.3 now.

I could also just implement what makes the most sense even if people
don't reach a general consensus that that is what WSGI 1.1 should do.
As far as I can see so far, this would still be WSGI 1.0 compliant, but
what is the point if a WSGI 1.0 compliant application can't make use of
it, and WSGI 1.1 may never come out or may be different anyway?

Graham

From ianb at colorstudy.com  Fri Mar 30 02:19:44 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 29 Mar 2007 19:19:44 -0500
Subject: [Web-SIG] CherryPy WSGI server and wsgi.input.read() with no
 argument.
In-Reply-To: <88e286470703291509r2702fd0cp2fe08f40b72624ae@mail.gmail.com>
References: <88e286470703290348j68b0a333qb6e9935b610fa494@mail.gmail.com>	<435DF58A933BA74397B42CDEB8145A860A96FFB8@ex9.hostedexchange.local>
	<88e286470703291509r2702fd0cp2fe08f40b72624ae@mail.gmail.com>
Message-ID: <460C57A0.9080506@colorstudy.com>

Graham Dumpleton wrote:
> """The application should not attempt to read more data than is
> specified by the CONTENT_LENGTH variable."""
> 
> is actually a bit bogus.
> 
> This is because a WSGI middleware component or web server could be
> acting as an input filter and decompressing a content encoding of gzip
> for request. Since it knows the size will change but will not know
> what the new size would be, except by buffering it all, it by rights
> should remove CONTENT_LENGTH. This presents a problem for an
> application, as there is then no CONTENT_LENGTH to rely on to know
> whether it has read too much input. If you leave CONTENT_LENGTH intact,
> it may think it has read everything when there is in fact more.

I thought leaving it out might be a good way to indicate 
content-length-unknown, but now I'm not so sure.  I think a better 
indication is "-1", which works with cgi.FieldStorage and lots of other 
code, and generally .read(-1) means "give me everything you have".
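
As a rough illustration of the "-1" idea (a proposal in this thread, not part of WSGI 1.0), an application could interpret CONTENT_LENGTH along these lines. The names UNKNOWN, declared_length and read_body are hypothetical, and the unbounded read is only safe if the server or middleware guarantees EOF at the end of the request body.

```python
import io

UNKNOWN = None  # hypothetical marker: a body is present but its length is unknown


def declared_length(environ):
    """Interpret CONTENT_LENGTH, treating a negative value as 'unknown'."""
    raw = environ.get("CONTENT_LENGTH", "")
    if raw == "":
        return 0  # missing/empty currently means "no body" to much code
    value = int(raw)
    return UNKNOWN if value < 0 else value


def read_body(environ):
    """Read the body, falling back to read(-1) when the length is unknown."""
    stream = environ["wsgi.input"]
    length = declared_length(environ)
    if length is UNKNOWN:
        # Assumption: whoever set "-1" also guarantees EOF on wsgi.input.
        return stream.read(-1)
    return stream.read(length)


# Usage: a decompressing filter that cannot know the new size sets "-1".
example = read_body({"CONTENT_LENGTH": "-1", "wsgi.input": io.BytesIO(b"abcdef")})
```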


-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
             | Write code, do good | http://topp.openplans.org/careers

From pje at telecommunity.com  Fri Mar 30 02:30:37 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 29 Mar 2007 19:30:37 -0500
Subject: [Web-SIG] CherryPy WSGI server and wsgi.input.read() with no
 argument.
In-Reply-To: <F8F9093A-E957-4567-BA4A-6807EEAB1254@fuhm.net>
References: <88e286470703291509r2702fd0cp2fe08f40b72624ae@mail.gmail.com>
	<88e286470703290348j68b0a333qb6e9935b610fa494@mail.gmail.com>
	<435DF58A933BA74397B42CDEB8145A860A96FFB8@ex9.hostedexchange.local>
	<88e286470703291509r2702fd0cp2fe08f40b72624ae@mail.gmail.com>
Message-ID: <5.1.1.6.0.20070329191711.04129658@sparrow.telecommunity.com>

At 06:52 PM 3/29/2007 -0400, James Y Knight wrote:
>Perhaps a WSGI 1.1 fixing these issues would be a good idea?

I would personally rather see a WSGI 2.0 that also gets rid of 
start_response(), write(), and perhaps adds better async support.

I suspect that the current approach to using yield boundaries to indicate 
buffer flushing should be replaced with yielding an explicit flush request 
object.  WSGI beginners seem to think that write() and yield are like 
"print" in CGI, and thus end up writing code that performs crappily on 
compliant servers.  In retrospect, the "server push" use case is much less 
common and it's reasonable to have to do something explicit to support 
it.  Middleware would also be happier if it could tell when the application 
really wanted to flush the output.

Combining this with some way to yield "pauses" to better support async 
servers would be ideal.  It would also be nice if you could cleanly adapt 
WSGI 1.0 to 2.0 and vice versa, as long as you're using a reasonable subset 
(i.e. a subset that doesn't care about some of the quirks we need to fix).


From ianb at colorstudy.com  Fri Mar 30 02:56:36 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 29 Mar 2007 19:56:36 -0500
Subject: [Web-SIG] WSGI 2.0
Message-ID: <460C6044.2090602@colorstudy.com>

Do we want to discuss WSGI 2.0?  I added a wiki page here to list 
anything anyone wants to discuss for 2.0: http://wsgi.org/wsgi/WSGI_2.0

I've listed the things I can remember, and copying here:


start_response and write
------------------------

We could remove ``start_response`` and the writer that it implies.  This 
would lead to a signature like::

     def app(environ):
         return '200 OK', [('Content-type', 'text/plain')], ['Hello world']

That is, return a three-tuple of (status, headers, app_iter).

It's relatively simple to provide adapters to and from this signature to 
the WSGI 1.0 signature.
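
A minimal sketch of such adapters, written in modern Python 3 with bytes bodies. The function names v2_to_v1 and v1_to_v2 are hypothetical, and the 1.0-to-2.0 direction takes the simplest possible route: it buffers the whole response, including anything sent through write(), so it is not suitable for streaming.

```python
def v2_to_v1(app2):
    """Expose a proposed-2.0-style app (environ -> 3-tuple) as a WSGI 1.0 app."""
    def app1(environ, start_response):
        status, headers, app_iter = app2(environ)
        start_response(status, headers)
        return app_iter
    return app1


def v1_to_v2(app1):
    """Call a WSGI 1.0 app and return (status, headers, body_list).

    Simplification: the body is fully consumed, and write() output is
    collected and placed ahead of the iterated chunks.
    """
    def app2(environ):
        captured = {}
        written = []

        def start_response(status, headers, exc_info=None):
            captured["status"] = status
            captured["headers"] = list(headers)
            return written.append  # stands in for the legacy write() callable

        app_iter = app1(environ, start_response)
        try:
            body = list(app_iter)  # also forces generator apps to call start_response
        finally:
            close = getattr(app_iter, "close", None)
            if close is not None:
                close()
        return captured["status"], captured["headers"], written + body
    return app2


def hello2(environ):
    return "200 OK", [("Content-Type", "text/plain")], [b"Hello world"]


def legacy(environ, start_response):
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    write(b"a")
    yield b"b"
```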

Optional keys (removing)
------------------------

Several keys are optional in WSGI, but required in CGI, in particular 
``SCRIPT_NAME``, ``PATH_INFO`` and ``QUERY_STRING``.  Also 
``REMOTE_ADDR`` and ``SERVER_SOFTWARE`` are supposed to exist.

Unknown-length wsgi.input
-------------------------

There's no documented way to indicate that there *is* content in 
``environ['wsgi.input']``, but the content length is unknown.  A value 
of ``"-1"`` may work in many situations.  A missing ``CONTENT_LENGTH`` 
doesn't generally work currently (it's assumed to mean 0 by much code).

readline(size)
--------------

Currently the specification does not require servers to provide 
``environ['wsgi.input'].readline(size)`` (the size argument in 
particular).  But ``cgi.FieldStorage`` calls readline this way, so in 
effect it is required.

app_iter and threads
--------------------

It's not clear if the app_iter must be used in the same thread as the 
application.  Since the application is blocking, presumably *it* must be 
run all in one thread.  This should be more explicitly documented.


-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
             | Write code, do good | http://topp.openplans.org/careers

From graham.dumpleton at gmail.com  Fri Mar 30 03:10:17 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Fri, 30 Mar 2007 11:10:17 +1000
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <460C6044.2090602@colorstudy.com>
References: <460C6044.2090602@colorstudy.com>
Message-ID: <88e286470703291810n56593d3bq98ad7717f06b520e@mail.gmail.com>

On 30/03/07, Ian Bicking <ianb at colorstudy.com> wrote:
> Do we want to discuss WSGI 2.0?  I added a wiki page here to list
> anything anyone wants to discuss for 2.0: http://wsgi.org/wsgi/WSGI_2.0
>
> I've listed the things I can remember, and copying here:
>
> ...
>
> Optional keys (removing)
> ------------------------
>
> Several keys are optional in WSGI, but required in CGI, in particular
> ``SCRIPT_NAME``, ``PATH_INFO`` and ``QUERY_STRING``.  Also
> ``REMOTE_ADDR`` and ``SERVER_SOFTWARE`` are supposed to exist.

Huh. Where does it say that SCRIPT_NAME can be optional in WSGI? I
know it can be empty if the mount point is the root of the web server,
but that it can be absent altogether is new to me.

One other issue, if aiming at supporting chunked encoding for a
request, is how (if one even can) to make available the trailing headers
if present after the final null data block. Personally I am not sure
this one is worth the trouble, and it may be quite hard to even implement
with some web servers, as they don't even provide them as a separate
set of headers but simply merge them on top of the main request
headers.

Graham

From foom at fuhm.net  Fri Mar 30 03:35:22 2007
From: foom at fuhm.net (James Y Knight)
Date: Thu, 29 Mar 2007 21:35:22 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <460C6044.2090602@colorstudy.com>
References: <460C6044.2090602@colorstudy.com>
Message-ID: <F70C82D0-9CA2-41D6-88CB-DBE1DA98E9A7@fuhm.net>

On Mar 29, 2007, at 8:56 PM, Ian Bicking wrote:
> readline(size)
> --------------
>
> Currently the specification does not require servers to provide
> ``environ['wsgi.input'].readline(size)`` (the size argument in
> particular).  But ``cgi.FieldStorage`` calls readline this way, so in
> effect it is required.

I actually think a minor revision to WSGI should be issued  
immediately, the only change being that readline(size) is required to  
be implemented by servers/gateways, and bumping the rev number to  
1.1. Leaving the spec as it is is basically a lie. You cannot  
implement a WSGI server now, without implementing readline(size) and  
expect apps to work. Adding this is a completely backwards compatible  
change, and is probably already implemented in most (all?) servers,  
so it shouldn't be controversial.
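
For a server whose input object only offers read(n), one way to supply readline(size) is a small buffering wrapper. This is a sketch in modern Python 3 under stated assumptions: the class name ReadlineSizeInput is hypothetical, and it assumes the wrapped stream returns an empty result at the end of the request body (a real gateway would also cap reads at the remaining CONTENT_LENGTH).

```python
import io


class ReadlineSizeInput:
    """Add readline(size) on top of a stream that supports read(n)."""

    def __init__(self, stream):
        self._stream = stream
        self._buf = b""

    def read(self, size=-1):
        if size is None or size < 0:
            data, self._buf = self._buf + self._stream.read(), b""
            return data
        while len(self._buf) < size:
            chunk = self._stream.read(size - len(self._buf))
            if not chunk:
                break
            self._buf += chunk
        data, self._buf = self._buf[:size], self._buf[size:]
        return data

    def readline(self, size=-1):
        limit = None if size is None or size < 0 else size
        # Fill the buffer until it holds a newline, reaches the limit, or hits EOF.
        while b"\n" not in self._buf and (limit is None or len(self._buf) < limit):
            want = 8192 if limit is None else limit - len(self._buf)
            chunk = self._stream.read(want)
            if not chunk:
                break
            self._buf += chunk
        end = self._buf.find(b"\n") + 1 or len(self._buf)
        if limit is not None:
            end = min(end, limit)
        line, self._buf = self._buf[:end], self._buf[end:]
        return line


# Usage: a size-limited readline returns at most `size` bytes of the line.
wrapped = ReadlineSizeInput(io.BytesIO(b"first line\nsecond\nthird"))
first_piece = wrapped.readline(5)
```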

James

From pje at telecommunity.com  Fri Mar 30 04:41:12 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 29 Mar 2007 21:41:12 -0500
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <460C6044.2090602@colorstudy.com>
Message-ID: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com>

At 07:56 PM 3/29/2007 -0500, Ian Bicking wrote:
>Do we want to discuss WSGI 2.0?  I added a wiki page here to list
>anything anyone wants to discuss for 2.0: http://wsgi.org/wsgi/WSGI_2.0
>
>I've listed the things I can remember, and copying here:
>
>
>start_response and write
>------------------------
>
>We could remove ``start_response`` and the writer that it implies.  This
>would lead to a signature like::
>
>      def app(environ):
>          return '200 OK', [('Content-type', 'text/plain')], ['Hello world']
>
>That is, return a three-tuple of (status, headers, app_iter).
>
>It's relatively simple to provide adapters to and from this signature to
>the WSGI 1.0 signature.

I think we also want to have a value you can yield from the app_iter to 
explicitly request that the buffer be flushed, and that we should reopen 
the discussion about values to be yielded to communicate with async 
servers, indicating that the iterator should be paused pending input or 
some other operation.

Ideally, this should be done in a way that's easy for middleware to handle; 
a flush signal should be handled by the middleware *and* passed up the 
chain, while any other async signals would be passed directly up the chain 
(unless it's something like "pause for input" and the middleware controls 
the input).

If we do this right, it should be easier to write middleware that works 
correctly with respect to buffering, since the issues of flushing and 
pausing now become explicit rather than implicit.  (This should make it 
easier to teach/learn as well.)


>It's not clear if the app_iter must be used in the same thread as the
>application.  Since the application is blocking, presumably *it* must be
>run all in one thread.  This should be more explicitly documented.

Definitely.  I think that we should not require thread affinity between the 
application and the app_iter -- my feeling at this point is that actual 
yielding is an edge case with respect to most WSGI apps.  The common case 
WSGI application should be just returning a list or tuple with a single 
string in it, and not doing any complex iteration.  Allowing the server 
more flexibility here is probably the better choice.

Indeed, I'm not sure we should require thread affinity across invocations 
of app_iter.next().


From foom at fuhm.net  Fri Mar 30 05:08:39 2007
From: foom at fuhm.net (James Y Knight)
Date: Thu, 29 Mar 2007 23:08:39 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com>
Message-ID: <B6AB7C50-9799-4A39-9AEC-6A79CA946CD3@fuhm.net>


On Mar 29, 2007, at 10:41 PM, Phillip J. Eby wrote:

>> It's not clear if the app_iter must be used in the same thread as the
>> application.  Since the application is blocking, presumably *it*  
>> must be
>> run all in one thread.  This should be more explicitly documented.
>
> Definitely.  I think that we should not require thread affinity  
> between the
> application and the app_iter -- my feeling at this point is that  
> actual
> yielding is an edge case with respect to most WSGI apps.  The  
> common case
> WSGI application should be just returning a list or tuple with a  
> single
> string in it, and not doing any complex iteration.  Allowing the  
> server
> more flexibility here is probably the better choice.
>
> Indeed, I'm not sure we should require thread affinity across  
> invocations
> of app_iter.next().

I recall last time this issue was considered, one of the fundamental  
problems is that, if the same thread isn't used for both the app and  
all app_iter.next invocations, sqlite cannot be used. (unless you  
don't call sqlite functions in the iterate part, of course). And I'm  
sure there's other libraries that are similarly thread-safe but only  
if you restrict yourself to a single thread per handle.

That problem made me uncomfortable enough with using non-dedicated  
threads that I didn't attempt it. If WSGI 2.0 explicitly states that  
each call to the app's iterator can occur on a different thread, then  
I'd be more confident in telling people that it's their code that was  
broken. I suppose another flag could be added "wsgi.dedicated_thread"  
which is True only if every call to .next will be on the same thread  
as the call to your app. Of course that doesn't really help an app  
broken by it, just lets them error out early.

James

From ianb at colorstudy.com  Fri Mar 30 06:11:33 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 29 Mar 2007 23:11:33 -0500
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <B6AB7C50-9799-4A39-9AEC-6A79CA946CD3@fuhm.net>
References: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com>
	<B6AB7C50-9799-4A39-9AEC-6A79CA946CD3@fuhm.net>
Message-ID: <460C8DF5.20601@colorstudy.com>

James Y Knight wrote:
> 
> On Mar 29, 2007, at 10:41 PM, Phillip J. Eby wrote:
> 
>>> It's not clear if the app_iter must be used in the same thread as the
>>> application.  Since the application is blocking, presumably *it* must be
>>> run all in one thread.  This should be more explicitly documented.
>>
>> Definitely.  I think that we should not require thread affinity 
>> between the
>> application and the app_iter -- my feeling at this point is that actual
>> yielding is an edge case with respect to most WSGI apps.  The common case
>> WSGI application should be just returning a list or tuple with a single
>> string in it, and not doing any complex iteration.  Allowing the server
>> more flexibility here is probably the better choice.
>>
>> Indeed, I'm not sure we should require thread affinity across invocations
>> of app_iter.next().
> 
> I recall last time this issue was considered, one of the fundamental 
> problems is that, if the same thread isn't used for both the app and all 
> app_iter.next invocations, sqlite cannot be used. (unless you don't call 
> sqlite functions in the iterate part, of course). And I'm sure there's 
> other libraries that are similarly thread-safe but only if you restrict 
> yourself to a single thread per handle.

This aspect of SQLite totally sucks.  But I haven't encountered any 
other libraries with the same restrictions.  I might just not notice -- 
quite possible -- but still, I haven't noticed it.  And of course 
pre-fetching the results solves the problem.  The advantages seem far 
too substantial to give up just to cater to one stupid library.

At least it *seems* like there's an advantage, in that an async server 
could handle lots of slow-consuming clients (or large responses) without 
a whole lot of overhead, because it could deal with all the app_iter's 
in a single thread.  If that wouldn't work anyway, then it's no good, 
but I'm assuming that could work.

> That problem made me uncomfortable enough with using non-dedicated 
> threads that I didn't attempt it. If WSGI 2.0 explicitly states that 
> each call to the app's iterator can occur on a different thread, then 
> I'd be more confident in telling people that it's their code that was 
> broken. I suppose another flag could be added "wsgi.dedicated_thread" 
> which is True only if every call to .next will be on the same thread as 
> the call to your app. Of course that doesn't really help an app broken 
> by it, just lets them error out early.

That's essentially what wsgi.threaded and wsgi.multiprocess do.  I think 
it's a reasonable thing to give, because there is some potential that 
you'd get incorrect data instead of an exception if there really was 
problematic code.  And it would allow a SQLite user to at least call 
list() (or fetchall) on their app_iter.
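
A sketch of what that could look like from the application side, in modern Python 3. The "wsgi.dedicated_thread" key is only the flag proposed in this thread, not something any server is known to set, and the table and app name are made up for the example.

```python
import sqlite3


def rows_app(environ, start_response):
    """Stream rows lazily only when the (proposed) flag promises one thread."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (name TEXT)")
    conn.executemany("INSERT INTO t VALUES (?)", [("a",), ("b",)])
    cursor = conn.execute("SELECT name FROM t ORDER BY name")

    def lines(rows):
        for (name,) in rows:
            yield (name + "\n").encode("utf-8")

    start_response("200 OK", [("Content-Type", "text/plain")])
    if environ.get("wsgi.dedicated_thread", False):
        # Hypothetical flag: every next() runs on this thread, so the cursor
        # can safely be consumed while the response is being sent.
        return lines(cursor)
    # Otherwise touch SQLite only here, in the thread that called the app.
    rows = cursor.fetchall()
    conn.close()
    return list(lines(rows))


def ignore_start(status, headers, exc_info=None):
    return None


prefetched = rows_app({}, ignore_start)
```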

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
             | Write code, do good | http://topp.openplans.org/careers

From ianb at colorstudy.com  Fri Mar 30 06:16:07 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 29 Mar 2007 23:16:07 -0500
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <88e286470703291810n56593d3bq98ad7717f06b520e@mail.gmail.com>
References: <460C6044.2090602@colorstudy.com>
	<88e286470703291810n56593d3bq98ad7717f06b520e@mail.gmail.com>
Message-ID: <460C8F07.7000400@colorstudy.com>

Graham Dumpleton wrote:
> On 30/03/07, Ian Bicking <ianb at colorstudy.com> wrote:
>> Do we want to discuss WSGI 2.0?  I added a wiki page here to list
>> anything anyone wants to discuss for 2.0: http://wsgi.org/wsgi/WSGI_2.0
>>
>> I've listed the things I can remember, and copying here:
>>
>> ...
>>
>> Optional keys (removing)
>> ------------------------
>>
>> Several keys are optional in WSGI, but required in CGI, in particular
>> ``SCRIPT_NAME``, ``PATH_INFO`` and ``QUERY_STRING``.  Also
>> ``REMOTE_ADDR`` and ``SERVER_SOFTWARE`` are supposed to exist.
> 
> Huh. Where does it say that SCRIPT_NAME can be optional in WSGI? I
> know it can be empty if the mount point is the root of the web server,
> but that it can be absent altogether is new to me.

"The following variables must be present, unless their value would be an 
empty string, in which case they may be omitted, except as otherwise 
noted below."

It doesn't really say that SCRIPT_NAME and PATH_INFO are optional, but 
it doesn't clearly say they are not optional.  QUERY_STRING specifically 
is optional, but there's a bug in cgi.FieldStorage if you ever do omit 
it, so you really shouldn't.  And in the CGI spec QUERY_STRING is not 
optional.
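
Until the spec settles this, a server or middleware can sidestep the ambiguity by always filling in the keys with empty strings. A minimal sketch, with the hypothetical helper name normalize_environ:

```python
def normalize_environ(environ):
    """Return a copy of environ with possibly-omitted CGI keys set to ''."""
    fixed = dict(environ)
    for key in ("SCRIPT_NAME", "PATH_INFO", "QUERY_STRING"):
        # An omitted key and an empty string are meant to be equivalent.
        fixed.setdefault(key, "")
    return fixed


example = normalize_environ({"PATH_INFO": "/x"})
```

Applications can then index these keys directly instead of using environ.get() everywhere.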

I actually don't like REMOTE_ADDR being required, as sometimes it is not 
applicable.  For instance, if you are pre-requesting a resource or doing 
a totally internal request.  I could imagine putting a non-IP address 
there, but I think it would be better simply to omit the variable.

SERVER_SOFTWARE is mostly silly.

> One other issue, if aiming at supporting chunked encoding for a
> request, is how (if one even can) to make available the trailing headers
> if present after the final null data block. Personally I am not sure
> this one is worth the trouble and may be quite hard to even implement
> with some web servers as they don't even provide them as a separate
> set of headers but simply merge them on top of the main request
> headers.

Can you put this on the wiki?

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
             | Write code, do good | http://topp.openplans.org/careers

From ianb at colorstudy.com  Fri Mar 30 06:30:58 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 29 Mar 2007 23:30:58 -0500
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com>
Message-ID: <460C9282.9020507@colorstudy.com>

Phillip J. Eby wrote:
> At 07:56 PM 3/29/2007 -0500, Ian Bicking wrote:
>> Do we want to discuss WSGI 2.0?  I added a wiki page here to list
>> anything anyone wants to discuss for 2.0: http://wsgi.org/wsgi/WSGI_2.0
>>
>> I've listed the things I can remember, and copying here:
>>
>>
>> start_response and write
>> ------------------------
>>
>> We could remove ``start_response`` and the writer that it implies.  This
>> would lead to a signature like::
>>
>>      def app(environ):
>>          return '200 OK', [('Content-type', 'text/plain')], ['Hello world']
>>
>> That is, return a three-tuple of (status, headers, app_iter).
>>
>> It's relatively simple to provide adapters to and from this signature to
>> the WSGI 1.0 signature.
> 
> I think we also want to have a value you can yield from the app_iter to 
> explicitly request that the buffer be flushed, and that we should reopen 
> the discussion about values to be yielded to communicate with async 
> servers, indicating that the iterator should be paused pending input or 
> some other operation.

(this should probably be opened as a separate item from the signature 
change, as I don't think it relates much to that)

I'd rather not introduce new objects, since we don't have any new 
objects yet.  None is an obvious object, but it's vague in this context.  
To me it feels more like a pause than a flush.  Flush really means *do* 
something, and None feels like the no-op, which is more like a pause.

I've become interested in using WSGI middleware as an HTTP translating 
proxy, so the async opportunities are of more interest to me now.  In 
part just the app_iter non-thread-affinity change would be helpful, I 
think.  Dealing with large request bodies is harder, I think, because 
those would have to be processed before the WSGI app returned.  But 
that's less concerning to me.

It seems like if yielding None from an app_iter meant "put me at the 
back of the queue" that would be a fairly simple and effective way of 
handling async for large (or slow) response bodies.  This wouldn't 
really work for the Twisted stuff where you keep a response open and 
trickle out data based on server-side events (because you can't control 
when you get back to the beginning of the queue), but otherwise it seems 
pretty good.  I suppose full control could be allowed if you could do 
something like return an object that could be part of the event loop 
somehow.  If we had some standard async-wrapping-key of some sort, 
perhaps.  For example (I say with no real knowledge of Deferred):

environ['wsgi.async_callback'] = EventMatcher
# in the app:
yield environ['wsgi.async_callback'](some_event)
# in the server:
for item in app_iter:
     if isinstance(item, EventMatcher):
         # queue up the app_iter, leaving it paused until something
         # matching that event happens
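
The simpler variant, where yielding None just means "put me at the back of the queue", can be sketched as a single-threaded round-robin loop. This is only an illustration in modern Python 3: serve_round_robin is a hypothetical name, the "sending" is modelled as appending to a log, and a real server would sleep or poll rather than spin when every iterator yields None.

```python
from collections import deque


def serve_round_robin(app_iters):
    """Drive several app_iters in one thread.

    Returns a log of (iterator_index, chunk) in the order chunks were "sent".
    An item of None means "nothing to send yet"; the iterator is requeued.
    """
    log = []
    pending = deque((index, iter(it)) for index, it in enumerate(app_iters))
    while pending:
        index, it = pending.popleft()
        try:
            item = next(it)
        except StopIteration:
            continue  # this response is finished; drop it from the queue
        if item is not None:
            log.append((index, item))
        pending.append((index, it))  # back of the queue either way
    return log


def slow():
    yield None
    yield b"s1"
    yield None
    yield b"s2"


def fast():
    yield b"f1"
    yield b"f2"


order = serve_round_robin([slow(), fast()])
```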


I feel somehow that it could be useful for intermediaries to be able to 
filter out this callback, and so a documented key (or keys) would be 
good.  But I can't quite place why I'd want to do that.  Well, except 
that any intermediary would have to be able to detect this kind of 
object and pass it back up.  So maybe instead of filtering it out of the 
environ, there needs to be some easy test that can be applied.

What the event object looks like ("some_event"), I have no idea.

> Ideally, this should be done in a way that's easy for middleware to 
> handle; a flush signal should be handled by the middleware *and* passed 
> up the chain, while any other async signals would be passed directly up 
> the chain (unless it's something like "pause for input" and the 
> middleware controls the input).
> 
> If we do this right, it should be easier to write middleware that works 
> correctly with respect to buffering, since the issues of flushing and 
> pausing now become explicit rather than implicit.  (This should make it 
> easier to teach/learn as well.)

In terms of buffering, I can't think of many cases where it would 
matter.  Either the middleware passes back the response with no changes, 
or it needs to consume the entire response body (and probably headers 
and maybe status) to do whatever transformation it needs to do.

Things like pauses and async signals would ideally be passed upstream, 
but flushes and content would all be consumed by the middleware.

>> It's not clear if the app_iter must be used in the same thread as the
>> application.  Since the application is blocking, presumably *it* must be
>> run all in one thread.  This should be more explicitly documented.
> 
> Definitely.  I think that we should not require thread affinity between 
> the application and the app_iter -- my feeling at this point is that 
> actual yielding is an edge case with respect to most WSGI apps.  The 
> common case WSGI application should be just returning a list or tuple 
> with a single string in it, and not doing any complex iteration.  
> Allowing the server more flexibility here is probably the better choice.
> 
> Indeed, I'm not sure we should require thread affinity across 
> invocations of app_iter.next().

It seems unlikely there'd be a need to move it between threads, but then 
it doesn't seem like there's much need for the application to have it 
all called in one thread either (i.e., if you move threads once, moving 
threads again shouldn't be a problem).


-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
             | Write code, do good | http://topp.openplans.org/careers

From pje at telecommunity.com  Fri Mar 30 18:46:38 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 30 Mar 2007 11:46:38 -0500
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <B6AB7C50-9799-4A39-9AEC-6A79CA946CD3@fuhm.net>
References: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com>
	<5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070330112527.02c82c70@sparrow.telecommunity.com>

At 11:08 PM 3/29/2007 -0400, James Y Knight wrote:
>On Mar 29, 2007, at 10:41 PM, Phillip J. Eby wrote:
>>>It's not clear if the app_iter must be used in the same thread as the
>>>application.  Since the application is blocking, presumably *it*
>>>must be
>>>run all in one thread.  This should be more explicitly documented.
>>
>>Definitely.  I think that we should not require thread affinity
>>between the
>>application and the app_iter -- my feeling at this point is that
>>actual
>>yielding is an edge case with respect to most WSGI apps.  The
>>common case
>>WSGI application should be just returning a list or tuple with a
>>single
>>string in it, and not doing any complex iteration.  Allowing the
>>server
>>more flexibility here is probably the better choice.
>>
>>Indeed, I'm not sure we should require thread affinity across
>>invocations
>>of app_iter.next().
>
>I recall last time this issue was considered, one of the fundamental
>problems is that, if the same thread isn't used for both the app and
>all app_iter.next invocations, sqlite cannot be used. (unless you
>don't call sqlite functions in the iterate part, of course). And I'm
>sure there's other libraries that are similarly thread-safe but only
>if you restrict yourself to a single thread per handle.

Right -- but the point here is that you only need to *have* an iterator if 
you're doing server push or trying to stream large files.  I don't mind 
making these corner cases a bit tougher to implement, since they're fairly 
tough already.  If you're running a WSGI 1.0 app under a 2.0->1.0 adapter, 
you can always use an adapter that ensures thread affinity.  Indeed, any 
2.0->1.0 adapter that supports multiple write() calls is going to need to 
have some sort of threading mechanism anyway, unless it uses greenlets.


>That problem made me uncomfortable enough with using non-dedicated
>threads that I didn't attempt it. If WSGI 2.0 explicitly states that
>each call to the app's iterator can occur on a different thread, then
>I'd be more confident in telling people that it's their code that was
>broken. I suppose another flag could be added "wsgi.dedicated_thread"
>which is True only if every call to .next will be on the same thread
>as the call to your app. Of course that doesn't really help an app
>broken by it, just lets them error out early.

I'd like to have fewer optional things, rather than more, so I think we 
should either require a dedicated thread or make it non-dedicated.  It 
should be quite straightforward to implement a middleware component that 
ensures its wrappee is run entirely within a dedicated thread, using a Queue.
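
A rough sketch of such a middleware, in modern Python 3 (where the module is spelled queue rather than Queue). The name dedicated_thread is hypothetical. Two simplifications are assumed: one worker thread is started per request, and the queue is unbounded, so the wrapped app runs to completion even if the server stops consuming. The server's own start_response is replayed in the calling thread rather than invoked from the worker.

```python
import queue
import threading


def dedicated_thread(app):
    """Run app(...) and every step of its iteration on one worker thread."""
    def wrapper(environ, start_response):
        results = queue.Queue()

        def deferred_start(status, headers, exc_info=None):
            results.put(("start", (status, headers, exc_info)))
            # Legacy write() support: route written data through the queue too.
            return lambda data: results.put(("data", data))

        def worker():
            try:
                app_iter = app(environ, deferred_start)
                try:
                    for chunk in app_iter:
                        results.put(("data", chunk))
                finally:
                    close = getattr(app_iter, "close", None)
                    if close is not None:
                        close()
            except Exception as exc:
                results.put(("error", exc))
            else:
                results.put(("done", None))

        def consume():
            threading.Thread(target=worker, daemon=True).start()
            while True:
                kind, value = results.get()
                if kind == "start":
                    start_response(*value)  # replayed in the server's thread
                elif kind == "data":
                    yield value
                elif kind == "error":
                    raise value
                else:
                    return

        return consume()
    return wrapper


seen_threads = []


def thread_reporting_app(environ, start_response):
    seen_threads.append(threading.get_ident())
    start_response("200 OK", [("Content-Type", "text/plain")])

    def body():
        for part in (b"a", b"b"):
            seen_threads.append(threading.get_ident())
            yield part
    return body()
```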


From pje at telecommunity.com  Fri Mar 30 19:06:23 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 30 Mar 2007 12:06:23 -0500
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <460C9282.9020507@colorstudy.com>
References: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com>
	<5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070330115237.02cb6ee8@sparrow.telecommunity.com>

At 11:30 PM 3/29/2007 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>At 07:56 PM 3/29/2007 -0500, Ian Bicking wrote:
>>>Do we want to discuss WSGI 2.0?  I added a wiki page here to list
>>>anything anyone wants to discuss for 2.0: http://wsgi.org/wsgi/WSGI_2.0
>>>
>>>I've listed the things I can remember, and copying here:
>>>
>>>
>>>start_response and write
>>>------------------------
>>>
>>>We could remove ``start_response`` and the writer that it implies.  This
>>>would lead to a signature like::
>>>
>>>      def app(environ):
>>>          return '200 OK', [('Content-type', 'text/plain')], ['Hello world']
>>>
>>>That is, return a three-tuple of (status, headers, app_iter).
>>>
>>>It's relatively simple to provide adapters to and from this signature to
>>>the WSGI 1.0 signature.
>>I think we also want to have a value you can yield from the app_iter to 
>>explicitly request that the buffer be flushed, and that we should reopen 
>>the discussion about values to be yielded to communicate with async 
>>servers, indicating that the iterator should be paused pending input or 
>>some other operation.
>
>(this should probably be opened as a separate item from the signature 
>change, as I don't think it relates much to that)
>
>I'd rather not introduce new objects, since we don't have any new objects 
>yet.  None is an obvious object, but it's vague in this context.  To me it 
>feels more like a pause than a flush.  Flush really means *do* something, 
>and None feels like the no-op, which is more like a pause.
>
>I've become interested in using WSGI middleware as an HTTP translating 
>proxy, so the async opportunities are of more interest to me now.  In part 
>just the app_iter non-thread-affinity change would be helpful, I 
>think.  Dealing with large request bodies is harder, I think, because 
>those would have to be processed before the WSGI app returned.  But that's 
>less concerning to me.
>
>It seems like if yielding None from an app_iter meant "put me at the back 
>of the queue" that would be a fairly simple and effective way of handling 
>async for large (or slow) response bodies.  This wouldn't really work for 
>the Twisted stuff where you keep a response open and trickle out data 
>based on server-side events (because you can't control when you get back 
>to the beginning of the queue), but otherwise it seems pretty good.  I 
>suppose full control could be allowed if you could do something like 
>return an object that could be part of the event loop somehow.  If we had 
>some standard async-wrapping-key of some sort, perhaps.  For example (I 
>say with no real knowledge of Deferred):
>
>environ['wsgi.async_callback'] = EventMatcher
># in the app:
>yield environ['wsgi.async_callback'](some_event)
># in the server:
>for item in app_iter:
>     if isinstance(item, EventMatcher):
>         # queue up the app_iter, leaving it paused until something
>         # matching that event happens

I was thinking of something a bit simpler; the environ key would be an 
object that, when called, tells the server that it's okay to resume 
iteration attempts on the application.  A sort of "put me back on the queue 
for iteration" call.  The callback would have to be safe to call from any 
thread at any time, and must not re-enter anything, just re-enable iteration.


>I feel somehow that it could be useful for intermediaries to be able to 
>filter out this callback, and so a documented key (or keys) would be 
>good.  But I can't quite place why I'd want to do that.  Well, except that 
>any intermediary would have to be able to detect this kind of object and 
>pass it back up.  So maybe instead of filtering it out of the environ, 
>there needs to be some easy test that can be applied.

My thought is that flow control could be done with tuples whose first 
element is a number, and whose other elements are arguments.

Why a number and not a string?  So that if you forget to make it a tuple, 
it won't be sent as part of the output stream; it'll be detected as an 
error.  Also, numbers are harder to assign and keep track of, and we want 
to have a very small set of strictly-defined flow control operations: pause 
(aka "nothing to report yet"), flush, and perhaps "wait for input".

Alternatively, we could just go with numbers and not worry about tuples at 
all.  I don't actually know of anything that needs an argument.
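
A minimal sketch of how a server might tell body data apart from such signals. It assumes Python 3, where body chunks are bytes; the signal numbers and the name classify are hypothetical, since nothing in this thread assigns values:

```python
# Hypothetical signal numbers; the thread names the operations but
# does not assign any values to them.
PAUSE, FLUSH, WAIT_FOR_INPUT = 0, 1, 2


def classify(item):
    """Return ('data', chunk) or ('signal', (op, args)) for one yielded item."""
    if isinstance(item, bytes):
        return "data", item
    # Tuple form: first element is the operation number, the rest are arguments.
    if isinstance(item, tuple) and item and isinstance(item[0], int):
        return "signal", (item[0], item[1:])
    # Bare-number form, for operations that take no arguments.
    if isinstance(item, int):
        return "signal", (item, ())
    # Anything else (for example a str) is an error rather than silent output.
    raise TypeError("not body data or a flow-control signal: %r" % (item,))
```

With this shape, a mistyped signal never reaches the output stream; it surfaces as a TypeError at the point where the server consumes it.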


>>Ideally, this should be done in a way that's easy for middleware to 
>>handle; a flush signal should be handled by the middleware *and* passed 
>>up the chain, while any other async signals would be passed directly up 
>>the chain (unless it's something like "pause for input" and the 
>>middleware controls the input).
>>If we do this right, it should be easier to write middleware that works 
>>correctly with respect to buffering, since the issues of flushing and 
>>pausing now become explicit rather than implicit.  (This should make it 
>>easier to teach/learn as well.)
>
>In terms of buffering, I can't think of many cases where it would 
>matter.  Either the middleware passes back the response with no changes, 
>or it needs to consume the entire response body (and probably headers and 
>maybe status) to do whatever transformation it needs to do.
>
>Things like pauses and async signals would ideally be passed upstream, but 
>flushes and content would all be consumed by the middleware.

I can't think of any condition where middleware would *not* pass all of 
these up to its caller.  In the case of a "flush", it needs to first yield 
any buffered output, but it *must* still yield the flush.

For example, if you're doing server push, then the app should yield a flush 
prior to each new content boundary.  If the middleware is doing compression 
or some such, then it needs to restart encoding after each content 
boundary, as well as flush the prior encoded output.
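
As a rough illustration of the "yield buffered output, then still yield the flush" rule, here is a sketch of a buffering filter over an app_iter. It assumes integer signals and bytes body chunks; FLUSH, buffering_filter, and block_size are made-up names, not part of any spec:

```python
FLUSH = 1  # hypothetical signal value; the thread does not assign numbers


def buffering_filter(app_iter, block_size=8192):
    """Coalesce small chunks, but honour and forward flow-control signals."""
    pending = []
    size = 0
    for item in app_iter:
        if isinstance(item, int):
            # On a flush, emit whatever is buffered first...
            if item == FLUSH and pending:
                yield b"".join(pending)
                pending, size = [], 0
            # ...and always pass the signal itself up the chain.
            yield item
            continue
        pending.append(item)
        size += len(item)
        if size >= block_size:
            yield b"".join(pending)
            pending, size = [], 0
    if pending:
        yield b"".join(pending)
```

A compressing filter would follow the same outline, restarting its encoder at the point where this sketch resets the buffer.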


>>>It's not clear if the app_iter must be used in the same thread as the
>>>application.  Since the application is blocking, presumably *it* must be
>>>run all in one thread.  This should be more explicitly documented.
>>Definitely.  I think that we should not require thread affinity between 
>>the application and the app_iter -- my feeling at this point is that 
>>actual yielding is an edge case with respect to most WSGI apps.  The 
>>common case WSGI application should be just returning a list or tuple 
>>with a single string in it, and not doing any complex iteration.
>>Allowing the server more flexibility here is probably the better choice.
>>Indeed, I'm not sure we should require thread affinity across invocations 
>>of app_iter.next().
>
>It seems unlikely there'd be a need to move it between threads,

In the case of Twisted, the easiest way to run possibly-blocking app code 
would be "deferToThread(app_iter.next)", and the code could end up running 
in any of several pooled threads, each time.  So, really, the nominal case 
for Twisted is the one where you'd want there to be no need for affinity 
across iterations.


>but then it doesn't seem like there's much need for the application to 
>have it all called in one thread either (i.e., if you move threads once, 
>moving threads again shouldn't be a problem).
>
>
>--
>Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
>             | Write code, do good | http://topp.openplans.org/careers


From foom at fuhm.net  Fri Mar 30 19:26:02 2007
From: foom at fuhm.net (James Y Knight)
Date: Fri, 30 Mar 2007 13:26:02 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <5.1.1.6.0.20070330112527.02c82c70@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com>
	<5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com>
	<5.1.1.6.0.20070330112527.02c82c70@sparrow.telecommunity.com>
Message-ID: <047DEF82-CF27-4AA9-B611-0E8602E91C6D@fuhm.net>


On Mar 30, 2007, at 12:46 PM, Phillip J. Eby wrote:
>> I suppose another flag could be added "wsgi.dedicated_thread"
>> which is True only if every call to .next will be on the same thread
>> as the call to your app. Of course that doesn't really help an app
>> broken by it, just lets them error out early.
>
> I'd like to have fewer optional things, rather than more, so I  
> think we should either require a dedicated thread or make it non- 
> dedicated.  It should be quite straightforward to implement a  
> middleware component that ensures its wrappee is run entirely  
> within a dedicated thread, using a Queue.

You can't *require* the server to switch threads every iteration. In  
fact I'm willing to bet many servers will continue using a dedicated  
thread even if they're explicitly allowed to not do so. So having  
some indication as to which the server is doing might be helpful.
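
For reference, the Queue-based middleware mentioned in the quoted text could look roughly like the sketch below. It assumes the proposed (status, headers, app_iter) return signature rather than WSGI 1.0, and the names dedicated_thread and _DONE are invented for the example:

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the response body


def dedicated_thread(app):
    """Run app(environ) and every step of its iteration in one worker thread."""
    def wrapper(environ):
        # maxsize=1 keeps the worker at most one item ahead of the consumer.
        results = queue.Queue(maxsize=1)

        def worker():
            try:
                status, headers, app_iter = app(environ)
                results.put((status, headers))
                for item in app_iter:
                    results.put(item)
                results.put(_DONE)
            except Exception as exc:
                # Hand errors to the consuming thread instead of losing them.
                results.put(exc)

        threading.Thread(target=worker, daemon=True).start()

        first = results.get()
        if isinstance(first, Exception):
            raise first
        status, headers = first

        def body():
            while True:
                item = results.get()
                if item is _DONE:
                    return
                if isinstance(item, Exception):
                    raise item
                yield item

        return status, headers, body()
    return wrapper
```

This leaves out things a real component would need, such as calling close() on the wrapped iterable and stopping the worker when the consumer abandons the response.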

James

From fumanchu at amor.org  Fri Mar 30 19:32:19 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Fri, 30 Mar 2007 10:32:19 -0700
Subject: [Web-SIG] WSGI 2 and SERVER_PROTOCOL
Message-ID: <435DF58A933BA74397B42CDEB8145A860AA41BF3@ex9.hostedexchange.local>

RFC 2145 says:

  "An implementation of HTTP/x.b sending a message to a
   recipient whose version is known to be HTTP/x.a, a < b,
   MUST NOT depend on the recipient understanding a header
   not defined in the specification for HTTP/x.a.  For example,
   HTTP/1.0 clients cannot be expected to understand chunked
   encodings, and so an HTTP/1.1 server must never send
   "Transfer-Encoding: chunked" in response to an HTTP/1.0
   request."

In specific cases, implementations can choose to send some HTTP/1.1
headers to HTTP/1.0 clients, but in the general case, the solution is
usually to downgrade the entire HTTP response to 1.0 features only.

Under WSGI, "an implementation of HTTP/x.b" is an emergent property of
the entire stack; servers, middleware, and applications all share this
responsibility to downgrade the entire response to HTTP/1.0 features if
any of the other components is not HTTP/1.1 compliant.

Unfortunately, the WSGI 1.0 spec doesn't require WSGI servers to tell
WSGI applications what version of HTTP they support. If a WSGI origin
server "fails to satisfy one or more of the MUST or REQUIRED level
requirements for the protocols it implements" (as too many WSGI servers
do!), WSGI applications have no standardized way of knowing this, and
may output headers which contradict the version number output by the
WSGI server.

CherryPy hacks around this by having the origin server send a custom
entry in the WSGI environ called "ACTUAL_SERVER_PROTOCOL", which tells
the rest of the WSGI stack the version for which the origin server is at
least conditionally compliant:

    # Compare request and server HTTP protocol versions, in case our
    # server does not support the requested protocol. Limit our output
    # to min(req, server). We want the following output:
    #     request    server     actual written   supported response
    #     protocol   protocol  response protocol    feature set
    # a     1.0        1.0           1.0                1.0
    # b     1.0        1.1           1.1                1.0
    # c     1.1        1.0           1.0                1.0
    # d     1.1        1.1           1.1                1.1
    # Notice that, in (b), the response will be "HTTP/1.1" even though
    # the client only understands 1.0. RFC 2616 10.5.6 says we should
    # only return 505 if the _major_ version is different.
    rp = int(req_protocol[5]), int(req_protocol[7])
    sp = int(server.protocol[5]), int(server.protocol[7])
    if sp[0] != rp[0]:
        self.simple_response("505 HTTP Version Not Supported")
        return
    
    # Bah. "SERVER_PROTOCOL" is actually the REQUEST protocol.
    environ["SERVER_PROTOCOL"] = req_protocol
    
    # set a non-standard environ entry so the WSGI app can know what
    # the *real* server protocol is (and what features to support).
    # See http://www.faqs.org/rfcs/rfc2145.html.
    environ["ACTUAL_SERVER_PROTOCOL"] = server.protocol
    self.response_protocol = "HTTP/%s.%s" % min(rp, sp)

The "application-side" bits of CherryPy inspect this value (if present)
and perform the same min(rp, sp) calculation as the server in order to
determine which features to support.
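
A simplified sketch of that application-side calculation follows. This is not CherryPy's actual code; parse_protocol and response_feature_set are illustrative names, and falling back to HTTP/1.0 when the entry is absent is an assumption made here for safety:

```python
def parse_protocol(value):
    """Turn a string such as 'HTTP/1.1' into the tuple (1, 1)."""
    _name, _, version = value.partition("/")
    major, _, minor = version.partition(".")
    return int(major), int(minor)


def response_feature_set(environ):
    """Return min(request, server) as a (major, minor) tuple."""
    rp = parse_protocol(environ["SERVER_PROTOCOL"])
    # Assumption: with no ACTUAL_SERVER_PROTOCOL entry, treat the origin
    # server as HTTP/1.0 only.
    sp = parse_protocol(environ.get("ACTUAL_SERVER_PROTOCOL", "HTTP/1.0"))
    return min(rp, sp)
```

An application could then guard HTTP/1.1-only features with a check such as response_feature_set(environ) >= (1, 1).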

WSGI 2 should, at the least, add a standard environ entry similar to
ACTUAL_SERVER_PROTOCOL. This would provide the minimum enforcement of
full-stack compliance, since WSGI origin servers tend to be the
least-compliant portions of any WSGI stack. As far as I am aware, the
CherryPy 3 wsgiserver is the only one currently claiming to be even
"conditionally compliant" with HTTP/1.1.

WSGI 2 might, in addition, require WSGI origin servers to perform the
min(rp, sp) calculation once and pass the result in a new
"RESPONSE_PROTOCOL_SUPPORT" environ entry. Note this is not necessarily
the same version number as what will be output in the response
Status-Line:

  "An HTTP server SHOULD send a response version equal to
   the highest version for which the server is at least
   conditionally compliant, and whose major version is
   less than or equal to the one received in the request.
   An HTTP server MUST NOT send a version for which it is
   not at least conditionally compliant.  A server MAY
   send a 505 (HTTP Version Not Supported) response if [it]
   cannot send a response using the major version used
   in the client's request."

If a given WSGI application or middleware component is not at least
conditionally compliant with HTTP/1.1, the WSGI origin server should
downgrade the response version it emits in the Status-Line, but has no
standardized way to be informed of this state of affairs. Currently, the
burden tends to fall on those who compose WSGI stacks to manually
instruct the WSGI origin server to always output HTTP/1.0 if any WSGI
component is not conditionally compliant with HTTP/1.1. This issue may
need to be addressed in a separate spec covering the composition of WSGI
stacks.


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From ianb at colorstudy.com  Fri Mar 30 19:42:29 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 30 Mar 2007 12:42:29 -0500
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <5.1.1.6.0.20070330115237.02cb6ee8@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com>
	<5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com>
	<5.1.1.6.0.20070330115237.02cb6ee8@sparrow.telecommunity.com>
Message-ID: <460D4C05.9040404@colorstudy.com>

Phillip J. Eby wrote:
> I was thinking of something a bit simpler; the environ key would be an 
> object that, when called, tells the server that it's okay to resume 
> iteration attempts on the application.  A sort of "put me back on the 
> queue for iteration" call.  The callback would have to be safe to call 
> from any thread at any time, and must not re-enter anything, just 
> re-enable iteration.

OK, that makes sense.  So there's something like 
environ['wsgi.server_resume'] in the environment, and the app yields 
something that indicates a pause, then calls that value to undo the pause?

>>> Ideally, this should be done in a way that's easy for middleware to 
>>> handle; a flush signal should be handled by the middleware *and* 
>>> passed up the chain, while any other async signals would be passed 
>>> directly up the chain (unless it's something like "pause for input" 
>>> and the middleware controls the input).
>>> If we do this right, it should be easier to write middleware that 
>>> works correctly with respect to buffering, since the issues of 
>>> flushing and pausing now become explicit rather than implicit.  (This 
>>> should make it easier to teach/learn as well.)
>>
>> In terms of buffering, I can't think of many cases where it would 
>> matter.  Either the middleware passes back the response with no 
>> changes, or it needs to consume the entire response body (and probably 
>> headers and maybe status) to do whatever transformation it needs to do.
>>
>> Things like pauses and async signals would ideally be passed upstream, 
>> but flushes and content would all be consumed by the middleware.
> 
> I can't think of any condition where middleware would *not* pass all of 
> these up to its caller.  In the case of a "flush", it needs to first 
> yield any buffered output, but it *must* still yield the flush.

Is there any use to this?  If you are transforming output, the flush is 
unlikely to flush anything; all output will be buffered.

> For example, if you're doing server push, then the app should yield a 
> flush prior to each new content boundary.  If the middleware is doing 
> compression or some such, then it needs to restart encoding after each 
> content boundary, as well as flush the prior encoded output.

I suppose server push is the only place where flush really matters, and 
most output transformations will simply break server push.  As long as 
the async signals are easy to detect (e.g., an integer or tuple) then 
that's fine.

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
             | Write code, do good | http://topp.openplans.org/careers

From pje at telecommunity.com  Fri Mar 30 20:14:57 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 30 Mar 2007 13:14:57 -0500
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <460D4C05.9040404@colorstudy.com>
References: <5.1.1.6.0.20070330115237.02cb6ee8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com>
	<5.1.1.6.0.20070329212515.02c0f008@sparrow.telecommunity.com>
	<5.1.1.6.0.20070330115237.02cb6ee8@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070330125052.042c64d0@sparrow.telecommunity.com>

At 12:42 PM 3/30/2007 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>I was thinking of something a bit simpler; the environ key would be an 
>>object that, when called, tells the server that it's okay to resume 
>>iteration attempts on the application.  A sort of "put me back on the 
>>queue for iteration" call.  The callback would have to be safe to call 
>>from any thread at any time, and must not re-enter anything, just 
>>re-enable iteration.
>
>OK, that makes sense.  So there's something like 
>environ['wsgi.server_resume'] in the environment, and the app yields 
>something that indicates a pause, then calls that value to undo the pause?

Yep.  I guess we should distinguish here between "pause but poll" and 
"pause and wait for the callback".  i.e., the operations might be something 
like:

PAUSE_AND_POLL
PAUSE_AND_WAIT
FLUSH


>>>>Ideally, this should be done in a way that's easy for middleware to 
>>>>handle; a flush signal should be handled by the middleware *and* passed 
>>>>up the chain, while any other async signals would be passed directly up 
>>>>the chain (unless it's something like "pause for input" and the 
>>>>middleware controls the input).
>>>>If we do this right, it should be easier to write middleware that works 
>>>>correctly with respect to buffering, since the issues of flushing and 
>>>>pausing now become explicit rather than implicit.  (This should make it 
>>>>easier to teach/learn as well.)
>>>
>>>In terms of buffering, I can't think of many cases where it would 
>>>matter.  Either the middleware passes back the response with no changes, 
>>>or it needs to consume the entire response body (and probably headers 
>>>and maybe status) to do whatever transformation it needs to do.
>>>
>>>Things like pauses and async signals would ideally be passed upstream, 
>>>but flushes and content would all be consumed by the middleware.
>>I can't think of any condition where middleware would *not* pass all of 
>>these up to its caller.  In the case of a "flush", it needs to first 
>>yield any buffered output, but it *must* still yield the flush.
>
>Is there any use to this?  If you are transforming output, the flush is 
>unlikely to flush anything; all output will be buffered.

That depends on whether the transformation is of a streaming nature.  If 
you're talking about things that e.g. apply XSL or some such, those are 
probably really MFCs rather than true middleware, and it's okay for an MFC 
to have more constraints on its wrapped application than transparent 
middleware does.


>>For example, if you're doing server push, then the app should yield a 
>>flush prior to each new content boundary.  If the middleware is doing 
>>compression or some such, then it needs to restart encoding after each 
>>content boundary, as well as flush the prior encoded output.
>
>I suppose server push is the only place where flush really matters, and 
>most output transformations will simply break server push.

More precisely, they should just not apply their transformations to a 
multipart content type, unless they know how to handle it.

However, there is another place where flow control matters, and that is 
streaming files which are too large to practically buffer in memory.  Such 
files need a way to "suggest" that they be split into smaller blocks.

Having a requirement that flow control be passed through allows us to 
ensure that middleware doesn't try to consume the whole response, you see.

In WSGI 1.0, we handle this by treating *every* block as if it were 
followed by a flush, but in 2.0 I'd like to accommodate the fact that many 
people seem to think that yielding is like using "print" in CGI.

I'm not married to the specific mechanism we use, but I *would* like to see 
WSGI 2.0 make it easy for middleware authors to comply in such a way as to 
handle streaming and push correctly.

Hm.  Maybe what we need is a way to specify the *type* of response, so that 
middleware can ignore what it can't handle...  e.g.:

    def simple_app(environ):
        return resp_type, status, headers, content

Then if the response type is STREAM or ASYNC, the middleware could opt out 
of it, returning the response as-is.

OTOH, adding an extra return value seems like a pain when so few 
applications would use it, and so little middleware would care.  Maybe it 
would be better to add something to the start of the status string, 
instead?  E.g. "if status.startswith('!'): return original_response"?
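
A sketch of how middleware might use the status-prefix variant, assuming the three-tuple return signature and bytes body chunks. The uppercasing transformation and the name uppercase_middleware are purely illustrative:

```python
def uppercase_middleware(app):
    """Upper-case the body, unless the app marks the response as special."""
    def wrapper(environ):
        response = app(environ)
        status, headers, content = response
        if status.startswith("!"):
            # Streaming or async response: opt out and return it as-is.
            return response
        return status, headers, [chunk.upper() for chunk in content]
    return wrapper
```

Under this scheme something further up the stack (presumably the server) would still have to strip the marker before the status line is written.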


>   As long as the async signals are easy to detect (e.g., an integer or 
> tuple) then that's fine.
>
>--
>Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
>             | Write code, do good | http://topp.openplans.org/careers