From python at venix.com  Mon Mar 14 15:29:25 2005
From: python at venix.com (Lloyd Kvam)
Date: Mon Mar 14 15:29:30 2005
Subject: [Web-SIG] sending multi-part responses from the web server
Message-ID: <1110810564.22679.33.camel@laptop.venix.com>

I have a web site that generates a document for download "on the fly". 
It is requested through a submit button.  I also generate the response
html along with the download.  To create the multi-part response I used
the email modules MIME support.

Everything works, but to prevent quoted-printable encoding of the HTML,
I sued this line of code:
        body.add_payload(MIMEText(
		html,		# this is the unicode response
		_subtype='html',
		_charset=None))
body is the MIMEMultipart container.

Should I have used a different MIME module or a different approach?

-- 
Lloyd Kvam
Venix Corp

From python at venix.com  Wed Mar 16 21:05:25 2005
From: python at venix.com (Lloyd Kvam)
Date: Wed Mar 16 21:05:28 2005
Subject: [Web-SIG] sending multi-part responses from the web server
In-Reply-To: <1110810564.22679.33.camel@laptop.venix.com>
References: <1110810564.22679.33.camel@laptop.venix.com>
Message-ID: <1111003525.4272.193.camel@laptop.venix.com>

For anyone who is interested, multipart responses do not work with
Internet Explorer or Safari.  I am now back to doing things the
old-fashioned way with separate download and html responses with simple
"hand-built" headers


On Mon, 2005-03-14 at 09:29, Lloyd Kvam wrote:
> I have a web site that generates a document for download "on the fly". 
> It is requested through a submit button.  I also generate the response
> html along with the download.  To create the multi-part response I used
> the email modules MIME support.
> 
> Everything works, but to prevent quoted-printable encoding of the HTML,
> I sued this line of code:
>         body.add_payload(MIMEText(
> 		html,		# this is the unicode response
> 		_subtype='html',
> 		_charset=None))
> body is the MIMEMultipart container.
> 
> Should I have used a different MIME module or a different approach?
-- 

Lloyd Kvam
Venix Corp.
1 Court Street, Suite 378
Lebanon, NH 03766-1358

voice:	603-653-8139
fax:	320-210-3409
-- 
Lloyd Kvam
Venix Corp

From gvwilson at cs.utoronto.ca  Tue Mar 22 14:17:20 2005
From: gvwilson at cs.utoronto.ca (Greg Wilson)
Date: Tue Mar 22 14:33:49 2005
Subject: [Web-SIG] teaching web programming with Python?
Message-ID: <d1p5oa$kp$1@sea.gmane.org>

I'd be interested in hearing from anyone who is teaching/has taught 
university-level courses on web programming using Python; I'm going to 
be giving CSC309 at the University of Toronto this summer, and would 
like to give Python equal time with Java.

Thanks,
Greg

From sridharinfinity at gmail.com  Tue Mar 22 16:42:59 2005
From: sridharinfinity at gmail.com (Sridhar Ratna)
Date: Tue Mar 22 16:43:02 2005
Subject: [Web-SIG] teaching web programming with Python?
In-Reply-To: <d1p5oa$kp$1@sea.gmane.org>
References: <d1p5oa$kp$1@sea.gmane.org>
Message-ID: <8816fcf805032207425f24b93d@mail.gmail.com>

On Tue, 22 Mar 2005 08:17:20 -0500, Greg Wilson <gvwilson@cs.utoronto.ca> wrote:
> I'd be interested in hearing from anyone who is teaching/has taught
> university-level courses on web programming using Python; I'm going to
> be giving CSC309 at the University of Toronto this summer, and would
> like to give Python equal time with Java.
> 

I suggest you to teach Python before Java in any case. Only then the
students will appreciate the real differences between both the
languages.

-- 
Sridhar Ratna - http://www.livejournal.com/users/seedar/
From jjinux at gmail.com  Thu Mar 24 08:10:50 2005
From: jjinux at gmail.com (Shannon -jj Behrens)
Date: Thu Mar 24 08:10:53 2005
Subject: [Web-SIG] Aquarium
Message-ID: <c41f67b905032323101eb3e615@mail.gmail.com>

Hi!

I'm the author of Aquarium <http://aquarium.sf.net>.  I feel like an
idiot for not joining this mailing list years ago!  By the way, I'm
quite impressed with whoever did
<http://www.python.org/moin/Aquarium>.  There are so many Web
application frameworks, servers, templating engines, etc. that
understanding a project well enough to summarize it in this way is
really quite impressive.

Best Regards,
-jj

-- 
I have decided to switch to Gmail, but messages to my Yahoo account will
still get through.
From ianb at colorstudy.com  Mon Mar 28 00:49:20 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon Mar 28 00:49:03 2005
Subject: [Web-SIG] A query for hosting providers
Message-ID: <42473870.6070502@colorstudy.com>

I'm wondering -- and this is mostly directed to the hosting providers 
(Remi, Sean...) -- what are the problems with providing commodity-level 
hosting for Python programs?  I can think of some, but I'm curious what 
you've encountered and if you have ideas about how to improve things.

Some things I've thought about:
* Long running processes are hard to maintain (assuming we rule out 
CGI).  Code becomes stale, maybe the server process gets in a bad state. 
   Sometimes processes becomes wedged.  With mod_python this can effect 
the entire site.
* Isolating clients from each other can be difficult.  For mod_python 
I'm assuming each client needs their own Apache server.  Maybe this 
isn't as much of a problem these days, as virtualizing technologies have 
improved, and multiple Apache processes isn't that big of a deal.
* Setup of frameworks is all over the place.  Setting up multiple 
frameworks might be even more difficult.  Some of them may depend on 
mod_rewrite.  Server processes are all over the place as well.

But I don't have a real feeling for how to solve these, and I'm sure 
there's things I'm not thinking about.  How do you guys do it now, and 
if you could change this stuff -- on any level, from interpreter to 
framework -- what would you do?

-- 
Ian Bicking  /  ianb@colorstudy.com  / http://blog.ianbicking.org
From mike_mp at zzzcomputing.com  Mon Mar 28 02:54:24 2005
From: mike_mp at zzzcomputing.com (michael bayer)
Date: Mon Mar 28 02:54:30 2005
Subject: [Web-SIG] A query for hosting providers
In-Reply-To: <42473870.6070502@colorstudy.com>
References: <42473870.6070502@colorstudy.com>
Message-ID: <6bd1762989d9f2a7848aa5e5732b5951@zzzcomputing.com>

if you run multiple virtual hosts out of Apache, mod_python by default  
creates new subinterpreters, via Py_NewInterpreter, for each virtual  
host.  this can also be set up per apache directive, an arbitrary name,  
or within individual directories.

although how well Py_NewInterpreter separates each interpreter from  
each other, I am less certain of....though Im sure a particular  
subinterpreter would have to have a pretty catastrophic failure (i.e.  
segfault or similar) to affect the parent and/or siblings.

I am running multiple interpreters myself but it hasnt been heavily  
stress-tested.

On Mar 27, 2005, at 5:49 PM, Ian Bicking wrote:

> I'm wondering -- and this is mostly directed to the hosting providers  
> (Remi, Sean...) -- what are the problems with providing  
> commodity-level hosting for Python programs?  I can think of some, but  
> I'm curious what you've encountered and if you have ideas about how to  
> improve things.
>
> Some things I've thought about:
> * Long running processes are hard to maintain (assuming we rule out  
> CGI).  Code becomes stale, maybe the server process gets in a bad  
> state.   Sometimes processes becomes wedged.  With mod_python this can  
> effect the entire site.
> * Isolating clients from each other can be difficult.  For mod_python  
> I'm assuming each client needs their own Apache server.  Maybe this  
> isn't as much of a problem these days, as virtualizing technologies  
> have improved, and multiple Apache processes isn't that big of a deal.
> * Setup of frameworks is all over the place.  Setting up multiple  
> frameworks might be even more difficult.  Some of them may depend on  
> mod_rewrite.  Server processes are all over the place as well.
>
> But I don't have a real feeling for how to solve these, and I'm sure  
> there's things I'm not thinking about.  How do you guys do it now, and  
> if you could change this stuff -- on any level, from interpreter to  
> framework -- what would you do?
>
> --  
> Ian Bicking  /  ianb@colorstudy.com  / http://blog.ianbicking.org
> _______________________________________________
> Web-SIG mailing list
> Web-SIG@python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe:  
> http://mail.python.org/mailman/options/web-sig/ 
> mike_mp%40zzzcomputing.com
From ianb at colorstudy.com  Mon Mar 28 03:00:57 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon Mar 28 03:00:40 2005
Subject: [Web-SIG] A query for hosting providers
In-Reply-To: <6bd1762989d9f2a7848aa5e5732b5951@zzzcomputing.com>
References: <42473870.6070502@colorstudy.com>
	<6bd1762989d9f2a7848aa5e5732b5951@zzzcomputing.com>
Message-ID: <42475749.60700@colorstudy.com>

michael bayer wrote:
> if you run multiple virtual hosts out of Apache, mod_python by default  
> creates new subinterpreters, via Py_NewInterpreter, for each virtual  
> host.  this can also be set up per apache directive, an arbitrary name,  
> or within individual directories.
> 
> although how well Py_NewInterpreter separates each interpreter from  
> each other, I am less certain of....though Im sure a particular  
> subinterpreter would have to have a pretty catastrophic failure (i.e.  
> segfault or similar) to affect the parent and/or siblings.

One thing I've heard people mentioned is that, if multiple processors 
really work well, it would be nice to have access to them from Python as 
well as C.  Has anyone thought more deeply about that, and if it would 
work well.  Could it be the basis of restricted execution environments 
too?  Multiple interpreters seem like a hidden feature of Python, but 
I'm not sure why... it seems like a useful capability.

-- 
Ian Bicking  /  ianb@colorstudy.com  / http://blog.ianbicking.org
From python at venix.com  Mon Mar 28 16:34:49 2005
From: python at venix.com (Lloyd Kvam)
Date: Mon Mar 28 16:35:41 2005
Subject: [Web-SIG] A query for hosting providers
In-Reply-To: <42473870.6070502@colorstudy.com>
References: <42473870.6070502@colorstudy.com>
Message-ID: <1112020489.19670.42.camel@laptop.venix.com>

Speaking as a Tummy.com customer, I have a virtualized linux server.  My
processes are isolated from Tummy's other clients on that piece of
hardware at the OS level.  I still face the same issues with
long-running processes, but at least it is all from code that I maintain
on a server image that I manage.  If I had competing processes that were
hard to reconcile, I'd probably just get another virtual server from
Tummy.

On Sun, 2005-03-27 at 17:49, Ian Bicking wrote:
> I'm wondering -- and this is mostly directed to the hosting providers 
> (Remi, Sean...) -- what are the problems with providing commodity-level 
> hosting for Python programs?  I can think of some, but I'm curious what 
> you've encountered and if you have ideas about how to improve things.
> 
> Some things I've thought about:
> * Long running processes are hard to maintain (assuming we rule out 
> CGI).  Code becomes stale, maybe the server process gets in a bad state. 
>    Sometimes processes becomes wedged.  With mod_python this can effect 
> the entire site.
> * Isolating clients from each other can be difficult.  For mod_python 
> I'm assuming each client needs their own Apache server.  Maybe this 
> isn't as much of a problem these days, as virtualizing technologies have 
> improved, and multiple Apache processes isn't that big of a deal.
> * Setup of frameworks is all over the place.  Setting up multiple 
> frameworks might be even more difficult.  Some of them may depend on 
> mod_rewrite.  Server processes are all over the place as well.
> 
> But I don't have a real feeling for how to solve these, and I'm sure 
> there's things I'm not thinking about.  How do you guys do it now, and 
> if you could change this stuff -- on any level, from interpreter to 
> framework -- what would you do?
-- 
Lloyd Kvam
Venix Corp

From janssen at parc.com  Tue Mar 29 01:47:24 2005
From: janssen at parc.com (Bill Janssen)
Date: Tue Mar 29 01:47:53 2005
Subject: [Web-SIG] A query for hosting providers 
In-Reply-To: Your message of "Sun, 27 Mar 2005 14:49:20 PST."
	<42473870.6070502@colorstudy.com> 
Message-ID: <05Mar28.154729pst."58617"@synergy1.parc.xerox.com>

> * Long running processes are hard to maintain (assuming we rule out 
> CGI).  Code becomes stale, maybe the server process gets in a bad state. 
>    Sometimes processes becomes wedged.  With mod_python this can effect 
> the entire site.

I've been extremely impressed at how well Python's VM does at this.  I
run Medusa-based services for months at a time without trouble -- in
fact, they run fine till the machine is rebooted.  These servers are
doing multithreaded text and graphics manipulation with regular
expressions and PIL.  They often run Linux scripts in subprocesses.
Wedges are extremely rare, and I have yet to see one caused by Python
code.

Bill
From remi at cherrypy.org  Tue Mar 29 11:43:55 2005
From: remi at cherrypy.org (Remi Delon)
Date: Tue Mar 29 11:43:48 2005
Subject: [Web-SIG] A query for hosting providers
In-Reply-To: <42473870.6070502@colorstudy.com>
References: <42473870.6070502@colorstudy.com>
Message-ID: <4249235B.2070707@cherrypy.org>

> I'm wondering -- and this is mostly directed to the hosting providers 
> (Remi, Sean...) -- what are the problems with providing commodity-level 
> hosting for Python programs?  I can think of some, but I'm curious what 
> you've encountered and if you have ideas about how to improve things.
> 
> Some things I've thought about:
> * Long running processes are hard to maintain (assuming we rule out 
> CGI).  Code becomes stale, maybe the server process gets in a bad state. 
>   Sometimes processes becomes wedged.  With mod_python this can effect 
> the entire site.

Yes, maintaining long-running processes can be a pain, but that's not 
related to python itself, it's true regardless of the language that was 
used to write the program.

> * Isolating clients from each other can be difficult.  For mod_python 
> I'm assuming each client needs their own Apache server.

Yes, that's how we ended up setting up our mod_python accounts.
We also found stability problems in some of the other mod_* modules 
(mod_webkit, mod_skunkweb, ...) and they sometimes crashed the main 
Apache server (very bad). So for all the frameworks that support a 
standalone HTTP server mode (CherryPy, Webware, Skunkweb, ...) we now 
set them up as standalone HTTP server listening on a local port, and we 
just use our main Apache server as a proxy to these servers.
This allows us to use the trick described on this page: 
http://www.cherrypy.org/wiki/BehindApache (look for "autostart.cgi") to 
have Apache restart the server automatically if it ever goes down.

>  Maybe this 
> isn't as much of a problem these days, as virtualizing technologies have 
> improved, and multiple Apache processes isn't that big of a deal.
> * Setup of frameworks is all over the place.  Setting up multiple 
> frameworks might be even more difficult.  Some of them may depend on 
> mod_rewrite.  Server processes are all over the place as well.
> 
> But I don't have a real feeling for how to solve these, and I'm sure 
> there's things I'm not thinking about.

Well, the 2 main problems that I can think of are:
     - Python frameworks tend to work as long-running processes, which 
have a lot of advantages for your site, but are a nightmare for hosting 
providers. There are soooo many things to watch for: CPU usage (a 
process can start "spinning"), RAM usage, process crashing, ... But that 
is not related to python and any hosting provider that supports 
long-running processes face the same challenge. For instance, we support 
Tomcat and the problems are the same. For this we ended up writing a lot 
of custom monitoring scripts on our own (we couldn't find exactly what 
we needed out there). Fortunately, python makes it easy to write these 
scripts :-)
     - But another challenge (and this one is more specific to Python) 
is the number of python versions and third party modules that we have to 
support. For instance, at Python-Hosting.com, we have to support all 4 
versions of python: 2.1, 2.2, 2.3 and 2.4, and all of them are being 
used by various people. And for each version, we usually have 10 to 20 
third-party modules (mysql-python, psycopg, elementtree, sqlobject, ...) 
that people need ! We run Red Hat Enterprise 3, but RPMs for python are 
not designed to work with multiple python versions installed, and RPMs 
for third-party modules are usually inexistent. As a result, we have to 
build all the python-related stuff from source. And some of these 
modules are sometimes hard to build (the python-subversion bindings for 
instance) and you can run into some library-version-compatibility 
nightmare. And as if this wasn't enough, new releases of modules come 
out everyday ...
I think that this second point is the main challenge and any hosting 
provider that is not specialized in python doesn't have the time or the 
knowledge to build and maintain all these python versions and 
third-party modules. Of course, they could just say "we're going to 
support this specific python version with these few third-party modules 
and that's it", but experience shows that most people need at least one 
or 2 "uncommon" third-party modules for their site so if that module is 
missing they just can't run their site ...

But above all, I think that the main reason why python frameworks are 
not more commonly supported by the big hosting providers is because the 
market for these frameworks is very small (apart from Zope/Plone). For 
all the "smaller" frameworks (CherryPy, Webware, SkunkWeb, Quixote, ...) 
we host less than 50 of each, so the big hosting providers simply won't 
bother learning these frameworks and supporting them for such a small 
market.

--
Remi / http://www.python-hosting.com
From sridharinfinity at gmail.com  Tue Mar 29 15:03:08 2005
From: sridharinfinity at gmail.com (Sridhar Ratna)
Date: Tue Mar 29 15:03:12 2005
Subject: [Web-SIG] A query for hosting providers
In-Reply-To: <4249235B.2070707@cherrypy.org>
References: <42473870.6070502@colorstudy.com> <4249235B.2070707@cherrypy.org>
Message-ID: <8816fcf8050329050377d0ab76@mail.gmail.com>

On Tue, 29 Mar 2005 10:43:55 +0100, Remi Delon <remi@cherrypy.org> wrote:
> This allows us to use the trick described on this page:
> http://www.cherrypy.org/wiki/BehindApache (look for "autostart.cgi") to
> have Apache restart the server automatically if it ever goes down.
> 

A main disadvantage of using apache to start the HTTP server is
process UID.  The HTTP server will be started under the UID of the
webserver. 'suid' is a security risk as it requires apache to be run
as root.


-- 
Sridhar Ratna - http://www.livejournal.com/users/seedar/
From remi at cherrypy.org  Tue Mar 29 15:09:01 2005
From: remi at cherrypy.org (Remi Delon)
Date: Tue Mar 29 15:08:56 2005
Subject: [Web-SIG] A query for hosting providers
In-Reply-To: <8816fcf8050329050377d0ab76@mail.gmail.com>
References: <42473870.6070502@colorstudy.com> <4249235B.2070707@cherrypy.org>
	<8816fcf8050329050377d0ab76@mail.gmail.com>
Message-ID: <4249536D.4010709@cherrypy.org>

>>This allows us to use the trick described on this page:
>>http://www.cherrypy.org/wiki/BehindApache (look for "autostart.cgi") to
>>have Apache restart the server automatically if it ever goes down.
>>
> 
> 
> A main disadvantage of using apache to start the HTTP server is
> process UID.  The HTTP server will be started under the UID of the
> webserver. 'suid' is a security risk as it requires apache to be run
> as root.

Our Apache servers run as nobody and we use SuExec so the other HTTP 
servers get started as the customer they belongs to.

Remi.

From ianb at colorstudy.com  Tue Mar 29 19:27:20 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue Mar 29 19:28:35 2005
Subject: [Web-SIG] A query for hosting providers
In-Reply-To: <4249235B.2070707@cherrypy.org>
References: <42473870.6070502@colorstudy.com> <4249235B.2070707@cherrypy.org>
Message-ID: <42498FF8.6030407@colorstudy.com>

Remi Delon wrote:
>> I'm wondering -- and this is mostly directed to the hosting providers 
>> (Remi, Sean...) -- what are the problems with providing 
>> commodity-level hosting for Python programs?  I can think of some, but 
>> I'm curious what you've encountered and if you have ideas about how to 
>> improve things.
>>
>> Some things I've thought about:
>> * Long running processes are hard to maintain (assuming we rule out 
>> CGI).  Code becomes stale, maybe the server process gets in a bad 
>> state.   Sometimes processes becomes wedged.  With mod_python this can 
>> effect the entire site.
> 
> 
> Yes, maintaining long-running processes can be a pain, but that's not 
> related to python itself, it's true regardless of the language that was 
> used to write the program.
> 
>> * Isolating clients from each other can be difficult.  For mod_python 
>> I'm assuming each client needs their own Apache server.
> 
> 
> Yes, that's how we ended up setting up our mod_python accounts.
> We also found stability problems in some of the other mod_* modules 
> (mod_webkit, mod_skunkweb, ...) and they sometimes crashed the main 
> Apache server (very bad). So for all the frameworks that support a 
> standalone HTTP server mode (CherryPy, Webware, Skunkweb, ...) we now 
> set them up as standalone HTTP server listening on a local port, and we 
> just use our main Apache server as a proxy to these servers.
> This allows us to use the trick described on this page: 
> http://www.cherrypy.org/wiki/BehindApache (look for "autostart.cgi") to 
> have Apache restart the server automatically if it ever goes down.

On our own servers we've been using CGI connectors (wkcgi, Zope.cgi), 
which seem fast enough, and of course won't be crashing Apache.

Have you looked at Supervisor for long running processes?
   http://www.plope.com/software/supervisor/
I haven't had a chance to use it, but it looks useful for this sort of 
thing.

HTTP does seem like a reasonable way to communicate between servers, 
instead of all these ad hoc HTTP-like protocols (PCGI, SCGI, FastCGI, 
mod_webkit, etc).  My only disappointment with that technique is that 
you lose some context -- e.g., if REMOTE_USER is set, or 
SCRIPT_NAME/PATH_INFO (you probably have to configure your URLs, since 
they aren't detectable), mod_rewrite's additional environmental 
variables, etc.  Hmm... I notice you use custom headers for that 
(CP-Location), and I suppose other variables could also be passed 
through... it's just unfortunate because that significantly adds to the 
Apache configuration, which is something I try to avoid -- it's easy 
enough to put in place, but hard to maintain.

>>  Maybe this isn't as much of a problem these days, as virtualizing 
>> technologies have improved, and multiple Apache processes isn't that 
>> big of a deal.
>> * Setup of frameworks is all over the place.  Setting up multiple 
>> frameworks might be even more difficult.  Some of them may depend on 
>> mod_rewrite.  Server processes are all over the place as well.
>>
>> But I don't have a real feeling for how to solve these, and I'm sure 
>> there's things I'm not thinking about.
> 
> 
> Well, the 2 main problems that I can think of are:
>     - Python frameworks tend to work as long-running processes, which 
> have a lot of advantages for your site, but are a nightmare for hosting 
> providers. There are soooo many things to watch for: CPU usage (a 
> process can start "spinning"), RAM usage, process crashing, ... But that 
> is not related to python and any hosting provider that supports 
> long-running processes face the same challenge. For instance, we support 
> Tomcat and the problems are the same. For this we ended up writing a lot 
> of custom monitoring scripts on our own (we couldn't find exactly what 
> we needed out there). Fortunately, python makes it easy to write these 
> scripts :-)

Do you do monitoring on a per-process basis (like a supervisor process) 
or just globally scan through the processes and kill off any bad ones? 
I've though that a forking server with a parent that monitored children 
carefully would be nice, which would be kind of a per-process monitor. 
It would mean I'd have to start thinking multiprocess, reversing all my 
threaded habits, but I think I'm willing to do that in return for really 
good reliability.

>     - But another challenge (and this one is more specific to Python) is 
> the number of python versions and third party modules that we have to 
> support. For instance, at Python-Hosting.com, we have to support all 4 
> versions of python: 2.1, 2.2, 2.3 and 2.4, and all of them are being 
> used by various people. And for each version, we usually have 10 to 20 
> third-party modules (mysql-python, psycopg, elementtree, sqlobject, ...) 
> that people need ! We run Red Hat Enterprise 3, but RPMs for python are 
> not designed to work with multiple python versions installed, and RPMs 
> for third-party modules are usually inexistent. As a result, we have to 
> build all the python-related stuff from source. And some of these 
> modules are sometimes hard to build (the python-subversion bindings for 
> instance) and you can run into some library-version-compatibility 
> nightmare. And as if this wasn't enough, new releases of modules come 
> out everyday ...

For the apps I've been deploying internally -- where we have both a more 
controlled and less controlled environment than a commercial host -- 
I've been installing every prerequesite in a per-application location, 
i.e., ``python setup.py install --install-lib=app/stdlib``.  Python 
module versioning issues are just too hard to resolve, and I'd rather 
leave standard-packages with only really stable software that I don't 
often need to update (like mxDateTime), and put everything else next to 
the application.

> I think that this second point is the main challenge and any hosting 
> provider that is not specialized in python doesn't have the time or the 
> knowledge to build and maintain all these python versions and 
> third-party modules. Of course, they could just say "we're going to 
> support this specific python version with these few third-party modules 
> and that's it", but experience shows that most people need at least one 
> or 2 "uncommon" third-party modules for their site so if that module is 
> missing they just can't run their site ...

Any reason for all the Python versions?  Well, I guess it's hard to ask 
clients to upgrade.  If I was to support people in that way, I'd 
probably try to standardize a Python version or two, and some core 
modules (probably the ones that are harder to build, like database 
drivers), and ask users to install everything else in their own 
environment.  But of course when you are in service you have to do what 
people want you to do...

> But above all, I think that the main reason why python frameworks are 
> not more commonly supported by the big hosting providers is because the 
> market for these frameworks is very small (apart from Zope/Plone). For 
> all the "smaller" frameworks (CherryPy, Webware, SkunkWeb, Quixote, ...) 
> we host less than 50 of each, so the big hosting providers simply won't 
> bother learning these frameworks and supporting them for such a small 
> market.

If they could support all of them at once, do you think it would be more 
interesting to hosting providers?

-- 
Ian Bicking  /  ianb@colorstudy.com  /  http://blog.ianbicking.org
From jjinux at gmail.com  Wed Mar 30 06:29:02 2005
From: jjinux at gmail.com (Shannon -jj Behrens)
Date: Wed Mar 30 06:40:46 2005
Subject: [Web-SIG] Welcome to JJinuxLand
Message-ID: <c41f67b9050329202949fda267@mail.gmail.com>

Karl Guertin told me, "He who blogs gets his opinion heard." Hence,
inspired by the blogs of Anthony Eden, Leon Atkinson, and Ian Bicking,
I have created my own. This is a purely technical blog concerning
topics such as Python, Linux, FreeBSD, open source software, the Web,
and lesser-known programming languages.  In the interest of everyone's
time, including my own, I hope to keep the volume low, and the
technical content high.

Please excuse the fact that I did not write my own blog software. It
isn't that I am unable to. It's simply that I hate reinventing the
wheel when exceedingly round ones already exist. I chose blogger.com
specifically on the recommendation of Krishna Srinivasan, Tung Wai
Yip, Anthony Eden, and Google.  I look forward to your comments.

<http://jjinux.blogspot.com/>
From ianb at colorstudy.com  Wed Mar 30 06:54:21 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed Mar 30 06:53:58 2005
Subject: [Web-SIG] WSGI start_response exc_info argument
Message-ID: <424A30FD.6050302@colorstudy.com>

I realize I've been making invalid WSGI middleware for a while now.  I 
guess I kind of knew that I was.  Anyway, reviewing the spec again and 
looking at the exc_info argument to start_response, I feel a little 
unsure.  I think I actually somehow got that argument in there by way of 
some argument I made, but I can't remember what, and it doesn't make 
sense to me now.  Relevent sections:

http://www.python.org/peps/pep-0333.html#the-start-response-callable
http://www.python.org/peps/pep-0333.html#error-handling

It seems like, in the small number of cases where this matters 
(basically error catching middleware and actual servers) it's easy 
enough to just code this up normally, I guess I don't see why the extra 
argument is needed to pass the error up the stack...?

-- 
Ian Bicking  /  ianb@colorstudy.com  / http://blog.ianbicking.org
From pje at telecommunity.com  Wed Mar 30 18:49:09 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Mar 30 18:45:14 2005
Subject: [Web-SIG] WSGI start_response exc_info argument
In-Reply-To: <424A30FD.6050302@colorstudy.com>
Message-ID: <5.1.1.6.0.20050330114245.03865400@mail.telecommunity.com>

At 10:54 PM 3/29/05 -0600, Ian Bicking wrote:
>I realize I've been making invalid WSGI middleware for a while now.  I 
>guess I kind of knew that I was.  Anyway, reviewing the spec again and 
>looking at the exc_info argument to start_response, I feel a little 
>unsure.  I think I actually somehow got that argument in there by way of 
>some argument I made, but I can't remember what, and it doesn't make sense 
>to me now.

IIRC, it wasn't you, but Tony Lownds.


>   Relevent sections:
>
>http://www.python.org/peps/pep-0333.html#the-start-response-callable
>http://www.python.org/peps/pep-0333.html#error-handling
>
>It seems like, in the small number of cases where this matters (basically 
>error catching middleware and actual servers) it's easy enough to just 
>code this up normally, I guess I don't see why the extra argument is 
>needed to pass the error up the stack...?

That's not the use case.  The parameter exists so error handling code 
doesn't have to care whether start_response has already been called.  Thus, 
applications and middleware can be simpler because they don't need to track 
that bit of state information that the server is already tracking.  So, 
calling start_response when it has already been called causes the error 
handler to abort and fall back to the next higher error handler, all the 
way up to the "real" server.  IOW, it's a way of guaranteeing immediate 
request termination if an error occurs once the response has begun.

Of course, any logging or notification error handlers in the stack will 
receive the error in the normal way; it's just that if they also try to 
start a response, they'll be aborted and the error will bubble up to the 
next handler.  Does that make more sense now?


From ianb at colorstudy.com  Wed Mar 30 18:57:34 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed Mar 30 18:58:34 2005
Subject: [Web-SIG] WSGI start_response exc_info argument
In-Reply-To: <5.1.1.6.0.20050330114245.03865400@mail.telecommunity.com>
References: <5.1.1.6.0.20050330114245.03865400@mail.telecommunity.com>
Message-ID: <424ADA7E.8070001@colorstudy.com>

Phillip J. Eby wrote:
> That's not the use case.  The parameter exists so error handling code 
> doesn't have to care whether start_response has already been called.  
> Thus, applications and middleware can be simpler because they don't need 
> to track that bit of state information that the server is already 
> tracking.  So, calling start_response when it has already been called 
> causes the error handler to abort and fall back to the next higher error 
> handler, all the way up to the "real" server.  IOW, it's a way of 
> guaranteeing immediate request termination if an error occurs once the 
> response has begun.
> 
> Of course, any logging or notification error handlers in the stack will 
> receive the error in the normal way; it's just that if they also try to 
> start a response, they'll be aborted and the error will bubble up to the 
> next handler.  Does that make more sense now?

I guess, but it seems to complicate most middleware for the benefit of a 
small number of middlewares.  My current middleware (all of which is 
written ignorant of this argument) does something like:

class ErrorMiddleware(object):

     def __init__(self, application, show_exceptions=True,
                  email_exceptions_to=[], smtp_server='localhost'):
         self.application = application
         self.show_exceptions = show_exceptions
         self.email_exceptions_to = email_exceptions_to
         self.smtp_server = smtp_server

     def __call__(self, environ, start_response):
         # We want to be careful about not sending headers twice,
         # and the content type that the app has committed to (if there
         # is an exception in the iterator body of the response)
         started = []

         def detect_start_response(status, headers):
             started.append(True)
             return start_response(status, headers)

         try:
             app_iter = self.application(environ, detect_start_response)
             return self.catching_iter(app_iter, environ)
         except:
             if not started:
                 start_response('500 Internal Server Error',
                                [('content-type', 'text/html')])
             # @@: it would be nice to deal with bad content types here
             dummy_file = StringIO()
             response = self.exception_handler(sys.exc_info(), environ)
             return [response]


It really should capture the headers, and maybe buffer them itself (in 
which case it would also have to intercept the writer), so that it can 
deal more gracefully with a case where content type is set or something. 
  But all that annoying stuff is better kept to this one piece of 
middleware, instead of making everything more difficult with that extra 
argument to start_response.

-- 
Ian Bicking  /  ianb@colorstudy.com  /  http://blog.ianbicking.org
From pje at telecommunity.com  Wed Mar 30 21:22:15 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Mar 30 21:18:23 2005
Subject: [Web-SIG] WSGI start_response exc_info argument
In-Reply-To: <424ADA7E.8070001@colorstudy.com>
References: <5.1.1.6.0.20050330114245.03865400@mail.telecommunity.com>
	<5.1.1.6.0.20050330114245.03865400@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050330141819.034371c0@mail.telecommunity.com>

At 10:57 AM 3/30/05 -0600, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>That's not the use case.  The parameter exists so error handling code 
>>doesn't have to care whether start_response has already been called.
>>Thus, applications and middleware can be simpler because they don't need 
>>to track that bit of state information that the server is already 
>>tracking.  So, calling start_response when it has already been called 
>>causes the error handler to abort and fall back to the next higher error 
>>handler, all the way up to the "real" server.  IOW, it's a way of 
>>guaranteeing immediate request termination if an error occurs once the 
>>response has begun.
>>Of course, any logging or notification error handlers in the stack will 
>>receive the error in the normal way; it's just that if they also try to 
>>start a response, they'll be aborted and the error will bubble up to the 
>>next handler.  Does that make more sense now?
>
>I guess, but it seems to complicate most middleware for the benefit of a 
>small number of middlewares.  My current middleware (all of which is 
>written ignorant of this argument) does something like:
>
>class ErrorMiddleware(object):
>
>     def __init__(self, application, show_exceptions=True,
>                  email_exceptions_to=[], smtp_server='localhost'):
>         self.application = application
>         self.show_exceptions = show_exceptions
>         self.email_exceptions_to = email_exceptions_to
>         self.smtp_server = smtp_server
>
>     def __call__(self, environ, start_response):
>         # We want to be careful about not sending headers twice,
>         # and the content type that the app has committed to (if there
>         # is an exception in the iterator body of the response)
>         started = []
>
>         def detect_start_response(status, headers):
>             started.append(True)
>             return start_response(status, headers)
>
>         try:
>             app_iter = self.application(environ, detect_start_response)
>             return self.catching_iter(app_iter, environ)
>         except:
>             if not started:
>                 start_response('500 Internal Server Error',
>                                [('content-type', 'text/html')])
>             # @@: it would be nice to deal with bad content types here
>             dummy_file = StringIO()
>             response = self.exception_handler(sys.exc_info(), environ)
>             return [response]
>
>
>It really should capture the headers, and maybe buffer them itself (in 
>which case it would also have to intercept the writer), so that it can 
>deal more gracefully with a case where content type is set or 
>something.  But all that annoying stuff is better kept to this one piece 
>of middleware, instead of making everything more difficult with that extra 
>argument to start_response.

Um, the argument is *precisely* there so you don't need all of that! Try this:


def __call__(self, environ, start_response):
     try:
         app_iter = self.application(environ, detect_start_response)
     except:
         start_response('500 Internal Server Error',
                                [('content-type', 'text/html')], 
sys.exc_info())
         # etc...

No need to track the startedness here, because the upstream start_response 
will reraise the error if you've already started, thus breaking out of the 
middleware and getting back to the calling server.

From ianb at colorstudy.com  Wed Mar 30 21:21:38 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed Mar 30 21:22:26 2005
Subject: [Web-SIG] WSGI start_response exc_info argument
In-Reply-To: <5.1.1.6.0.20050330141819.034371c0@mail.telecommunity.com>
References: <5.1.1.6.0.20050330114245.03865400@mail.telecommunity.com>
	<5.1.1.6.0.20050330114245.03865400@mail.telecommunity.com>
	<5.1.1.6.0.20050330141819.034371c0@mail.telecommunity.com>
Message-ID: <424AFC42.2070003@colorstudy.com>

Phillip J. Eby wrote:
>> It really should capture the headers, and maybe buffer them itself (in 
>> which case it would also have to intercept the writer), so that it can 
>> deal more gracefully with a case where content type is set or 
>> something.  But all that annoying stuff is better kept to this one 
>> piece of middleware, instead of making everything more difficult with 
>> that extra argument to start_response.
> 
> 
> Um, the argument is *precisely* there so you don't need all of that! Try 
> this:

But I don't mind all of that, because it is only contained in the error 
catching middleware and no where else.  I have other middleware that 
overrides start_response, and don't want to bother with all the exc_info 
in that case.  And a lot of the logic -- like trying to show errors even 
when there's been a partial response -- is just work, there's no way to 
get around it.

> 
> def __call__(self, environ, start_response):
>     try:
>         app_iter = self.application(environ, detect_start_response)
>     except:
>         start_response('500 Internal Server Error',
>                                [('content-type', 'text/html')], 
> sys.exc_info())
>         # etc...
> 
> No need to track the startedness here, because the upstream 
> start_response will reraise the error if you've already started, thus 
> breaking out of the middleware and getting back to the calling server.


-- 
Ian Bicking  /  ianb@colorstudy.com  /  http://blog.ianbicking.org
From pje at telecommunity.com  Wed Mar 30 21:58:43 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Mar 30 22:03:06 2005
Subject: [Web-SIG] WSGI start_response exc_info argument
In-Reply-To: <424AFC42.2070003@colorstudy.com>
References: <5.1.1.6.0.20050330141819.034371c0@mail.telecommunity.com>
	<5.1.1.6.0.20050330114245.03865400@mail.telecommunity.com>
	<5.1.1.6.0.20050330114245.03865400@mail.telecommunity.com>
	<5.1.1.6.0.20050330141819.034371c0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050330145219.02942e20@mail.telecommunity.com>

At 01:21 PM 3/30/05 -0600, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>>It really should capture the headers, and maybe buffer them itself (in 
>>>which case it would also have to intercept the writer), so that it can 
>>>deal more gracefully with a case where content type is set or 
>>>something.  But all that annoying stuff is better kept to this one piece 
>>>of middleware, instead of making everything more difficult with that 
>>>extra argument to start_response.
>>
>>Um, the argument is *precisely* there so you don't need all of that! Try 
>>this:
>
>But I don't mind all of that, because it is only contained in the error 
>catching middleware and no where else.  I have other middleware that 
>overrides start_response, and don't want to bother with all the exc_info 
>in that case.

Just pass it through to the upstream start_response; the top-level server 
is the only one that needs to care.


>   And a lot of the logic -- like trying to show errors even when there's 
> been a partial response -- is just work, there's no way to get around it.

So leave it to the server.  All I'm saying is that there is no need to 
track whether the response has started.  It's the server's job to know 
that, and the opinion of middleware doesn't count here.  As long as the 
*server* hasn't sent the headers yet, you can restart the response.

Therefore, the correct way to send an error is for the error handler to 
pass exc_info to start_response, and middleware start_response() functions 
*must* pass that upward unless they definitely know better.  (E.g. because 
they're buffering and know the upstream start_response hasn't started yet.)

The point I'm trying to make here is that you seem to be trying to outsmart 
WSGI on this point; only the server is in a position to show an error in 
the case of a partial response, because it's the only component that 
definitively knows what has or hasn't been sent to the client.

From remi at cherrypy.org  Thu Mar 31 12:56:33 2005
From: remi at cherrypy.org (Remi Delon)
Date: Thu Mar 31 12:56:36 2005
Subject: [Web-SIG] A query for hosting providers
In-Reply-To: <42498FF8.6030407@colorstudy.com>
References: <42473870.6070502@colorstudy.com> <4249235B.2070707@cherrypy.org>
	<42498FF8.6030407@colorstudy.com>
Message-ID: <424BD761.6030202@cherrypy.org>

Ian Bicking wrote:
> Remi Delon wrote:
> 
>>> I'm wondering -- and this is mostly directed to the hosting providers 
>>> (Remi, Sean...) -- what are the problems with providing 
>>> commodity-level hosting for Python programs?  I can think of some, 
>>> but I'm curious what you've encountered and if you have ideas about 
>>> how to improve things.
>>>
>>> Some things I've thought about:
>>> * Long running processes are hard to maintain (assuming we rule out 
>>> CGI).  Code becomes stale, maybe the server process gets in a bad 
>>> state.   Sometimes processes becomes wedged.  With mod_python this 
>>> can effect the entire site.
>>
>>
>>
>> Yes, maintaining long-running processes can be a pain, but that's not 
>> related to python itself, it's true regardless of the language that 
>> was used to write the program.
>>
>>> * Isolating clients from each other can be difficult.  For mod_python 
>>> I'm assuming each client needs their own Apache server.
>>
>>
>>
>> Yes, that's how we ended up setting up our mod_python accounts.
>> We also found stability problems in some of the other mod_* modules 
>> (mod_webkit, mod_skunkweb, ...) and they sometimes crashed the main 
>> Apache server (very bad). So for all the frameworks that support a 
>> standalone HTTP server mode (CherryPy, Webware, Skunkweb, ...) we now 
>> set them up as standalone HTTP server listening on a local port, and 
>> we just use our main Apache server as a proxy to these servers.
>> This allows us to use the trick described on this page: 
>> http://www.cherrypy.org/wiki/BehindApache (look for "autostart.cgi") 
>> to have Apache restart the server automatically if it ever goes down.
> 
> 
> On our own servers we've been using CGI connectors (wkcgi, Zope.cgi), 
> which seem fast enough, and of course won't be crashing Apache.

Yeah, but we wanted a somewhat "standard" way of talking to Apache and
most frameworks do come with a small HTTP server, so that works fine for
us and it also completely isolates the process from Apache.

> Have you looked at Supervisor for long running processes?
>   http://www.plope.com/software/supervisor/
> I haven't had a chance to use it, but it looks useful for this sort of 
> thing.

Well, there are several such supervising tools (daemontools is another
one), but again, they never matched our exact needs. For instance,
sometimes it's OK if a process is down ... it could just be that the
user is working on his site. And also, they usually only watch one
thing: make sure that the process stays up, but there are a million
other things we wanted to watch for. So we just wrote our own scripts.


> HTTP does seem like a reasonable way to communicate between servers, 
> instead of all these ad hoc HTTP-like protocols (PCGI, SCGI, FastCGI, 
> mod_webkit, etc).  My only disappointment with that technique is that 
> you lose some context -- e.g., if REMOTE_USER is set, or 
> SCRIPT_NAME/PATH_INFO (you probably have to configure your URLs, since 
> they aren't detectable), mod_rewrite's additional environmental 
> variables, etc.  Hmm... I notice you use custom headers for that 
> (CP-Location), and I suppose other variables could also be passed 
> through... it's just unfortunate because that significantly adds to the 
> Apache configuration, which is something I try to avoid -- it's easy 
> enough to put in place, but hard to maintain.

The CP-Location trick is not needed (I should remove it from this page
as it confuses people).
Have a look at the section called "What are the drawbacks of running
CherryPy behind Apache ?" on this page:
http://www.cherrypy.org/wiki/CherryPyProductionSetup
It summarizes my view on this (basically, there aren't any real 
drawbacks if you're using mod_rewrite with Apache2).


>>>  Maybe this isn't as much of a problem these days, as virtualizing 
>>> technologies have improved, and multiple Apache processes isn't that 
>>> big of a deal.
>>> * Setup of frameworks is all over the place.  Setting up multiple 
>>> frameworks might be even more difficult.  Some of them may depend on 
>>> mod_rewrite.  Server processes are all over the place as well.
>>>
>>> But I don't have a real feeling for how to solve these, and I'm sure 
>>> there's things I'm not thinking about.
>>
>> Well, the 2 main problems that I can think of are:
>>     - Python frameworks tend to work as long-running processes, which 
>> have a lot of advantages for your site, but are a nightmare for 
>> hosting providers. There are soooo many things to watch for: CPU usage 
>> (a process can start "spinning"), RAM usage, process crashing, ... But 
>> that is not related to python and any hosting provider that supports 
>> long-running processes face the same challenge. For instance, we 
>> support Tomcat and the problems are the same. For this we ended up 
>> writing a lot of custom monitoring scripts on our own (we couldn't 
>> find exactly what we needed out there). Fortunately, python makes it 
>> easy to write these scripts :-)
> 
> 
> Do you do monitoring on a per-process basis (like a supervisor process) 
> or just globally scan through the processes and kill off any bad ones?

We monitor the general health of our servers on various levels and we
monitor the response time of some key sites/services on each of our 
servers to make sure that overall the server is OK.
For each individual site of our customers, we only have scripts that try
to restart the sites if they ever go down, but that's it (if the
customer changed their site and broke it, there isn't much we can do
about it).

> I've though that a forking server with a parent that monitored children 
> carefully would be nice, which would be kind of a per-process monitor. 
> It would mean I'd have to start thinking multiprocess, reversing all my 
> threaded habits, but I think I'm willing to do that in return for really 
> good reliability.

I'm still very much on the "thread pool" camp :-)
I've got CherryPy sites that run in a thread pool mode for months 
without any stability or memory leak problem.
If your process crashes or leaks memory then there's something wrong 
with your program in the first place, and the right way to solve it is 
not to switch to a multiprocess model.
Finally, if you want a monitoring process, it can be a completely 
separate process which allows you to still keep a "thread pool" model 
for your main process.

>>     - But another challenge (and this one is more specific to Python) 
>> is the number of python versions and third party modules that we have 
>> to support. For instance, at Python-Hosting.com, we have to support 
>> all 4 versions of python: 2.1, 2.2, 2.3 and 2.4, and all of them are 
>> being used by various people. And for each version, we usually have 10 
>> to 20 third-party modules (mysql-python, psycopg, elementtree, 
>> sqlobject, ...) that people need ! We run Red Hat Enterprise 3, but 
>> RPMs for python are not designed to work with multiple python versions 
>> installed, and RPMs for third-party modules are usually inexistent. As 
>> a result, we have to build all the python-related stuff from source. 
>> And some of these modules are sometimes hard to build (the 
>> python-subversion bindings for instance) and you can run into some 
>> library-version-compatibility nightmare. And as if this wasn't enough, 
>> new releases of modules come out everyday ...
> 
> 
> For the apps I've been deploying internally -- where we have both a more 
> controlled and less controlled environment than a commercial host -- 
> I've been installing every prerequesite in a per-application location, 
> i.e., ``python setup.py install --install-lib=app/stdlib``.  Python 
> module versioning issues are just too hard to resolve, and I'd rather 
> leave standard-packages with only really stable software that I don't 
> often need to update (like mxDateTime), and put everything else next to 
> the application.

Well, we have a mix of both: for all "more or less common" modules, we 
install them system-wide. If someone wants a really "esoteric" module 
that noone else on the server is likely to use, we usually tell them to 
install it in their home directory.

>> I think that this second point is the main challenge and any hosting 
>> provider that is not specialized in python doesn't have the time or 
>> the knowledge to build and maintain all these python versions and 
>> third-party modules. Of course, they could just say "we're going to 
>> support this specific python version with these few third-party 
>> modules and that's it", but experience shows that most people need at 
>> least one or 2 "uncommon" third-party modules for their site so if 
>> that module is missing they just can't run their site ...
> 
> Any reason for all the Python versions?  Well, I guess it's hard to ask 
> clients to upgrade.  If I was to support people in that way, I'd 
> probably try to standardize a Python version or two, and some core 
> modules (probably the ones that are harder to build, like database 
> drivers), and ask users to install everything else in their own 
> environment.  But of course when you are in service you have to do what 
> people want you to do...

Well, we very much decide what software/version we support based on 
customer demand ... If enough people want python 2.1, 2.2, 2.3 and 2.4 
(which is the case right now), then we support all of them ...
Recently there was a high demand for a commercial Trac/Subversion 
hosting with backups and HTTPS access, so we came up with such an offer 
and it turned out to be quite successful.

>> But above all, I think that the main reason why python frameworks are 
>> not more commonly supported by the big hosting providers is because 
>> the market for these frameworks is very small (apart from Zope/Plone). 
>> For all the "smaller" frameworks (CherryPy, Webware, SkunkWeb, 
>> Quixote, ...) we host less than 50 of each, so the big hosting 
>> providers simply won't bother learning these frameworks and supporting 
>> them for such a small market.
> 
> If they could support all of them at once, do you think it would be more 
> interesting to hosting providers?

Well, if all frameworks came in nicely packaged RPMs and they all 
integrated the same way with Apache (mod_wsgi anyone ?) I guess that 
would be a big step forward ... But you'd still have the problem of all 
the python third-party modules that people need ...

Remi.


From p.f.moore at gmail.com  Thu Mar 31 20:32:39 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu Mar 31 20:32:44 2005
Subject: [Web-SIG] A query for hosting providers
In-Reply-To: <424BD761.6030202@cherrypy.org>
References: <42473870.6070502@colorstudy.com> <4249235B.2070707@cherrypy.org>
	<42498FF8.6030407@colorstudy.com> <424BD761.6030202@cherrypy.org>
Message-ID: <79990c6b050331103239c0cc6b@mail.gmail.com>

On Thu, 31 Mar 2005 11:56:33 +0100, Remi Delon <remi@cherrypy.org> wrote:

> The CP-Location trick is not needed (I should remove it from this page
> as it confuses people).

Hmm, I wrote that part of the page. My specific reason for using a
custom header is that it's the only way I can see to locate a CherryPy
application *not* at the root of a virtual host.

For example, if I have a CherryPy application exposed at (say
http://myserver/myapp/) then how do I write absolute URL paths (like
"/login" within my app? As far as I can tell, there's nothing
available to my application that says "/myapp/". Yes, I can hard code
it in the application, but I'd rather keep the value in one place -
it's necessary in the Apache config, so I'd like to pass it from there
to the app.

If I've missed something that I could have used, though, I'd like to
know (and I'll update the page appropriately).

Thanks,
Paul.
From ianb at colorstudy.com  Thu Mar 31 21:25:18 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu Mar 31 21:26:31 2005
Subject: [Web-SIG] A query for hosting providers
In-Reply-To: <424BD761.6030202@cherrypy.org>
References: <42473870.6070502@colorstudy.com> <4249235B.2070707@cherrypy.org>
	<42498FF8.6030407@colorstudy.com> <424BD761.6030202@cherrypy.org>
Message-ID: <424C4E9E.8010009@colorstudy.com>

Remi Delon wrote:
>> On our own servers we've been using CGI connectors (wkcgi, Zope.cgi), 
>> which seem fast enough, and of course won't be crashing Apache.
> 
> 
> Yeah, but we wanted a somewhat "standard" way of talking to Apache and
> most frameworks do come with a small HTTP server, so that works fine for
> us and it also completely isolates the process from Apache.

CGI is pretty standard, isn't it?  I think of the adapters as little 
pieces of the frameworks themselves.  Or just a simpler, more isolated 
alternative to mod_*.

>> Have you looked at Supervisor for long running processes?
>>   http://www.plope.com/software/supervisor/
>> I haven't had a chance to use it, but it looks useful for this sort of 
>> thing.
> 
> 
> Well, there are several such supervising tools (daemontools is another
> one), but again, they never matched our exact needs. For instance,
> sometimes it's OK if a process is down ... it could just be that the
> user is working on his site. And also, they usually only watch one
> thing: make sure that the process stays up, but there are a million
> other things we wanted to watch for. So we just wrote our own scripts.

Unlike daemontools, Supervisor is written in Python, which makes it good 
  ;)  It also seems like it's meant ot address just the kind of 
situation you're in -- allowing users to restart servers despite having 
different permissions, monitoring servers, etc.

>> HTTP does seem like a reasonable way to communicate between servers, 
>> instead of all these ad hoc HTTP-like protocols (PCGI, SCGI, FastCGI, 
>> mod_webkit, etc).  My only disappointment with that technique is that 
>> you lose some context -- e.g., if REMOTE_USER is set, or 
>> SCRIPT_NAME/PATH_INFO (you probably have to configure your URLs, since 
>> they aren't detectable), mod_rewrite's additional environmental 
>> variables, etc.  Hmm... I notice you use custom headers for that 
>> (CP-Location), and I suppose other variables could also be passed 
>> through... it's just unfortunate because that significantly adds to 
>> the Apache configuration, which is something I try to avoid -- it's 
>> easy enough to put in place, but hard to maintain.
> 
> 
> The CP-Location trick is not needed (I should remove it from this page
> as it confuses people).
> Have a look at the section called "What are the drawbacks of running
> CherryPy behind Apache ?" on this page:
> http://www.cherrypy.org/wiki/CherryPyProductionSetup
> It summarizes my view on this (basically, there aren't any real 
> drawbacks if you're using mod_rewrite with Apache2).

Does Apache 2 add a X-Original-URI header or something?  I see the 
Forwarded-For and Forwarded-Host headers, but those are only part of the 
request -- leaving out REMOTE_USER, SCRIPT_NAME, PATH_INFO, and maybe 
some other internal variables.

>> I've though that a forking server with a parent that monitored 
>> children carefully would be nice, which would be kind of a per-process 
>> monitor. It would mean I'd have to start thinking multiprocess, 
>> reversing all my threaded habits, but I think I'm willing to do that 
>> in return for really good reliability.
> 
> 
> I'm still very much on the "thread pool" camp :-)
> I've got CherryPy sites that run in a thread pool mode for months 
> without any stability or memory leak problem.
> If your process crashes or leaks memory then there's something wrong 
> with your program in the first place, and the right way to solve it is 
> not to switch to a multiprocess model.
> Finally, if you want a monitoring process, it can be a completely 
> separate process which allows you to still keep a "thread pool" model 
> for your main process.

That's true -- cleanly killing a threaded app can be hard, though, at 
least in my experience.  The other issue I worry about is scaling down 
while still having separation -- like if I want a simple form->mail 
script, how do I deploy that?  A separate threaded process is really 
heavyweight.  But is it a good idea to put it in a process shared with 
another application?  This is what leads me in the direction of multiple 
processes, even though I've been using thread pools for most of my 
applications in the past without problem.

>>> But above all, I think that the main reason why python frameworks are 
>>> not more commonly supported by the big hosting providers is because 
>>> the market for these frameworks is very small (apart from 
>>> Zope/Plone). For all the "smaller" frameworks (CherryPy, Webware, 
>>> SkunkWeb, Quixote, ...) we host less than 50 of each, so the big 
>>> hosting providers simply won't bother learning these frameworks and 
>>> supporting them for such a small market.
>>
>>
>> If they could support all of them at once, do you think it would be 
>> more interesting to hosting providers?
> 
> 
> Well, if all frameworks came in nicely packaged RPMs and they all 
> integrated the same way with Apache (mod_wsgi anyone ?) I guess that 
> would be a big step forward ... But you'd still have the problem of all 
> the python third-party modules that people need ...

Would mod_scgi be sufficient?  It's essentially equivalent to 
mod_webkit, mod_skunkweb, and PCGI, while avoiding the hassle of 
FastCGI.  In theory FastCGI is the way to do all of this, but despite my 
best efforts I can never get it to work.  Well "best efforts" might 
indicate more work than I've actually put into it, but enough effort to 
leave me thoroughly annoyed ;)

-- 
Ian Bicking  /  ianb@colorstudy.com  /  http://blog.ianbicking.org
From titus at caltech.edu  Thu Mar 31 21:33:59 2005
From: titus at caltech.edu (Titus Brown)
Date: Thu Mar 31 21:34:02 2005
Subject: [Web-SIG] A query for hosting providers
In-Reply-To: <79990c6b050331103239c0cc6b@mail.gmail.com>
References: <42473870.6070502@colorstudy.com> <4249235B.2070707@cherrypy.org>
	<42498FF8.6030407@colorstudy.com> <424BD761.6030202@cherrypy.org>
	<79990c6b050331103239c0cc6b@mail.gmail.com>
Message-ID: <20050331193358.GA32357@caltech.edu>

-> > The CP-Location trick is not needed (I should remove it from this page
-> > as it confuses people).
-> 
-> Hmm, I wrote that part of the page. My specific reason for using a
-> custom header is that it's the only way I can see to locate a CherryPy
-> application *not* at the root of a virtual host.
-> 
-> For example, if I have a CherryPy application exposed at (say
-> http://myserver/myapp/) then how do I write absolute URL paths (like
-> "/login" within my app? As far as I can tell, there's nothing
-> available to my application that says "/myapp/". Yes, I can hard code
-> it in the application, but I'd rather keep the value in one place -
-> it's necessary in the Apache config, so I'd like to pass it from there
-> to the app.
-> 
-> If I've missed something that I could have used, though, I'd like to
-> know (and I'll update the page appropriately).

Doesn't this preclude redeploying your WSGI apps under different
drivers, e.g. CGI vs SCGI vs...?  I tend to do things like run my
Quixote applications under '/url' and '/~t/url.cgi' -- the former
for deployment, the latter for testing and development.  You would
have to have a different Apache rewrite rule for each case, no?

I use relative URLs because of this, but I understand why this might
cause you problems.  Perhaps we can add a 'root namespace' parameter
to WSGI...

cheers,
--titus
From titus at caltech.edu  Thu Mar 31 21:36:06 2005
From: titus at caltech.edu (Titus Brown)
Date: Thu Mar 31 21:36:11 2005
Subject: [Web-SIG] A query for hosting providers
In-Reply-To: <20050331193358.GA32357@caltech.edu>
References: <42473870.6070502@colorstudy.com> <4249235B.2070707@cherrypy.org>
	<42498FF8.6030407@colorstudy.com> <424BD761.6030202@cherrypy.org>
	<79990c6b050331103239c0cc6b@mail.gmail.com>
	<20050331193358.GA32357@caltech.edu>
Message-ID: <20050331193606.GC32357@caltech.edu>

-> I use relative URLs because of this, but I understand why this might
-> cause you problems.  Perhaps we can add a 'root namespace' parameter
-> to WSGI...

duh.

That's what SCRIPT_NAME is.

Sorry.

;)

--titus