[Web-SIG] [ANN] Aspen 0.5 -- module reloading & directory handlers

Fri Dec 8 22:21:38 CET 2006

Chad Whitacre wrote:
> René,
> 
>> Why did you make it?  Just for fun?  Or is there some other reason you
>> chose to make it?
> 
> Thanks for biting. :-)
> 
> Not just for fun, no. I maintain about 30 websites, implemented 
> in a mix of server technologies:
> 
>    - Apache (static HTML, CGI, PHP)
>    - Zope 2 (with and without Plone)
>    - httpyd (Aspen fore-runner)
> 
> My goal with Aspen is to shove all of this heterogeneity into the 
> websites themselves, and use a single server for all of them. I 
> named it Aspen because a grove of aspen trees all share a common 
> root structure (ranking certain aspen groves among the world's 
> largest living things!).
> 
> As I mentioned earlier, I think Aspen shares this goal with Paste 
> Script/Deploy, but Aspen is more filesystem-centric, it's 
> intended for production use, and (IMO) it's simpler.

Of course Paste is intended for production use!  But maybe you mean 
paste.httpserver -- which is used in some production cases, but isn't 
entirely intended as such.  Not that we'd turn down patches to improve 
it, but serious testing in typical production situations hasn't really 
been done.

As for simplicity, well, I dunno but it depends a lot on what you are 
talking about.  Of course Paste is many things, but those things are 
loosely bound.

> Here are some design considerations:
> 
>    - I want to use a single web server for many heterogeneous
>      websites, from development through to production.

Paste is essentially web server agnostic.  paste.httpserver is often 
useful in development, but it's really no more bound to Paste than 
anything else.  *All* paste.httpserver does is HTTP, which is important 
IMHO -- by keeping its functionality at a minimum it is more feasible to 
choose other options when there's a reason to do so.  If Aspen sought to 
provide a more production-ready HTTP server then that'd be great.

>    - A website should "look like a website" on the filesystem. Any
>      project-y directory structure should be swept under the rug.
> 
>    - At the same time, websites should be self-contained, with all
>      packages and configuration together in one place.

I personally deal with that at the Python level, with tools like 
workingenv and virtual-python.  Anyone doing serious web development 
should definitely be using tools like that, and not installing stuff in 
their system Python.

>    - I should be able to type "aspen" in any directory and have
>      something smart happen.

I really can't predict what that would do...?  That doesn't seem simple 
to me.  It starts up a web server serving the current directory, I 
guess, but with what configuration?  What port, what logging, etc?

>    - Configuration syntaxes should be simple and stupid. Python
>      especially is waaay overkill as a configuration language.

Paste Deploy tries to keep configuration fairly simple.

> If Aspen were written in C you could compare it to Mongrel.
> 
> I'd be curious to know what you use to build and deploy your WSGI 
> websites. Do you use Paste Script/Deploy, or something else?

Right now I'm working in a different kind of development from you (more 
one big website), but previously I was probably in a very similar 
situation with lots of different clients sharing a machine but not 
sharing websites, with some shared applications and code but a lot of 
client-specific code too.

I guess I never actually described what we did there, but it took a 
while to find a good technique and it wasn't really finished before I left.

Because our sites had all been started in Apache (using SSIs for 
templating) we had an Apache frontend and often lots of complex config 
built up over time.  Each site had its own Apache instance and 
configuration (we didn't do virtual hosting).

To incorporate dynamic content we used rewrite rules and later 
ScriptAlias to send certain URLs to an external app server.  We used 
SCGI via cgi2scgi, and we just passed the full URL.

From there we had a map from URLs to config files.  It looked like:

   [app:main]
   use = egg:Paste#urlmap
   /app1 = config:app1.ini
   /app2 = config:app2.ini
   # (and so on)
   /some/other/app = config:other_app.ini

The name (e.g., "app1") showed up in a lot of places, and while we could 
change the name (e.g., /app1 = config:some_app.ini) we never did.  I 
think your filesystem layout could simplify that some.  But we often had 
reasons to point deep directories to some application point without 
handling the intermediate directories through the app server.
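
To make the URL-to-application map concrete, here is a rough Python sketch 
of longest-prefix dispatch between WSGI applications.  It is not Paste's 
urlmap code, just an illustration of the idea; make_url_map, make_echo_app 
and the echo applications are hypothetical names standing in for whatever 
app1.ini and friends would actually load.

```python
def make_url_map(mapping):
    """Return a WSGI app that dispatches on the longest matching URL prefix."""
    # Longest prefixes first, so /some/other/app is tried before any
    # shorter prefix that might also match.
    prefixes = sorted(mapping.items(), key=lambda item: len(item[0]), reverse=True)

    def url_map_app(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in prefixes:
            if path == prefix or path.startswith(prefix + "/"):
                # Move the matched prefix from PATH_INFO to SCRIPT_NAME,
                # as WSGI dispatchers conventionally do.
                child = dict(environ)
                child["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                child["PATH_INFO"] = path[len(prefix):]
                return app(child, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found\n"]

    return url_map_app


def make_echo_app(name):
    """Hypothetical stand-in for an application loaded from a config file."""
    def echo_app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        body = "%s saw PATH_INFO=%r" % (name, environ["PATH_INFO"])
        return [body.encode("utf-8")]
    return echo_app


site = make_url_map({
    "/app1": make_echo_app("app1"),
    "/app2": make_echo_app("app2"),
    "/some/other/app": make_echo_app("other_app"),
})
```

Note that a request for /some or /some/other falls through to the 404 
here: only the deep path is handed to an application, and the intermediate 
directories are left to whatever else serves the site.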

The config files then point to a WSGI application, which is probably 
similar to a handler in Aspen.  Usually there was only one instance of 
the application in a site; we could do more, but that was only used in a 
couple circumstances (though we were moving more in that direction as we 
reconsidered our expectations of what an application could do).  A 
really simple one, for instance, was something that picked a random URL 
from a text file (usually for image rotation):

   [app:main]
   use = egg:Randomize
   file = /path/to/file.txt

We'd put all the instances in one config file, with different sections 
for each case.
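
For a sense of how small such an application can be, here is a minimal 
sketch of a randomizer as a WSGI app.  This is not the original Randomize 
egg: the factory name make_randomize_app, the one-URL-per-line file format, 
and the choice to answer with a 302 redirect are all assumptions made for 
illustration.

```python
import random


def make_randomize_app(path, rng=random):
    """Return a WSGI app that redirects to a random URL listed in `path`.

    Assumes the file holds one URL per line; blank lines are ignored.
    """
    def randomize_app(environ, start_response):
        # Re-read the file on every request so edits take effect immediately.
        with open(path, encoding="utf-8") as f:
            urls = [line.strip() for line in f if line.strip()]
        if not urls:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"No URLs configured\n"]
        target = rng.choice(urls)
        start_response("302 Found", [("Location", target)])
        return [b""]

    return randomize_app
```

The `file` option from the config section would be what gets passed in as 
`path`.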

The static files were always kept very separate from the code files. 
Sometimes our code would write to static files, though usually not.

For templating we had a search path for templates.  The search path 
generally looked like:

   webhome/templates/app_name/template_name.pt
   webhome/templates/template_name.pt
   apphome/templates/template_name.pt

This let us distribute default templates with each app, with 
app-template or site-template overrides.  We used this a lot for 
client-specific tweaks.  Any other tweaks we'd typically facilitate 
through adding configuration options; we tried not to branch the code 
for very long.  Again, templates never got mixed with static content, 
and client-specific templates never got mixed with the generic 
application.  Everything went into version control, but in different trees.
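
The lookup order above can be sketched in a few lines of Python.  The 
function names and arguments here (template_search_path, find_template, 
webhome, apphome, app_name) are made up for illustration and are not taken 
from the code we actually ran; only the order of the three locations comes 
from the layout described above.

```python
import os


def template_search_path(webhome, apphome, app_name, template_name):
    """List candidate template files, most specific first."""
    return [
        # Site override for one particular application.
        os.path.join(webhome, "templates", app_name, template_name),
        # Site-wide override.
        os.path.join(webhome, "templates", template_name),
        # Default template distributed with the application.
        os.path.join(apphome, "templates", template_name),
    ]


def find_template(webhome, apphome, app_name, template_name):
    """Return the first candidate that exists on disk, or None."""
    for candidate in template_search_path(webhome, apphome, app_name, template_name):
        if os.path.isfile(candidate):
            return candidate
    return None
```

Because the application's own directory is consulted last, a client tweak 
is just a file dropped into the site's templates directory, and the generic 
application tree never has to change.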

While putting code in the site directly doesn't seem *horrible* to me, 
once we all learned the techniques of indirection I think keeping code 
separate from data (and data separate from templates) was extremely 
helpful.  We were probably a bigger team than what you're in, and so we 
had people who would only touch static content and templates and never 
code, or people who would only access static content through management 
tools.  So in part our layout reflected the way we split up jobs between 
people.

If I were going to do a filesystem-oriented layout, I'd want to at least 
have a parent directory of the site and not entirely mix static and 
dynamic content.  If nothing else it makes the svn layout easier to 
manage when sharing code (and if you aren't using version control, you 
are of course nuts ;).  Then I'd probably allow a small number of 
special files to cause different kinds of indirection, very similar to 
.htaccess (but of course much nicer than htaccess).

For instance, what if I wanted to allow people to upload files to a 
directory, but didn't trust the people?  I wouldn't want to put any file 
in that directory to indicate this, because I'm afraid someone would 
figure out a way to overwrite that file.  Given a totally representative 
filesystem layout this seems hard to implement.  With just a little 
indirection it's much more feasible.

That said, it would have been awesome to be able to just drop files in 
trusted locations and have functionality just appear.  We never really 
pursued this because we were still using Apache, though I guess if we 
were clever we could have figured it out with some rewrite rules or with 
AddHandler.  Keeping the extension out of the URL would have been 
problematic, though, and I hate code-specific extensions.  (Resource-specific 
extensions like .html don't bother me at all, though.)

Hrm... I didn't really mean to go into all that description.  But there 
it is.  I probably should write it up for real sometime.

> Thanks again for your questions.
> 
> 
> 
> chad
> 
> P.S. Paste includes a lot. I actually spent quite a while today 
> becoming more familiar with all of Paste's middleware, and I'm 
> pretty excited about that part of it. PonyMiddleware! :)

And now you are even a contributor to that middleware ;)

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org