[Web-SIG] PasteDeploy 0.1

Phillip J. Eby pje at telecommunity.com
Wed Aug 24 01:32:39 CEST 2005


At 05:27 PM 8/23/2005 -0500, Ian Bicking wrote:
>So services (aka components) are just objects with .get_service(key) 
>methods?  Is there any other API or semantics implied?

Not at the hand-wavy level at which we're currently discussing them, no.


>>No. The globalconfigservice *becomes* the parent_component of the 
>>components that follow it, until another non-wrapper component is defined 
>>(which then becomes the parent of those that follow it, and so on).
>
>Does the configuration somehow indicate that something produces a 
>component, as opposed to producing the object-in-question (WSGI 
>application for us)?  I'm not clear how an application, an application 
>wrapper, and a component wrapper are distinguished.

In the syntax I've been using to date, "wrapper" simply indicates that the 
component wishes to receive the components following it as an argument, 
replacing them with the wrapper's return value.  All non-wrappers are just 
components.
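
As a rough illustration of that "wrapper" rule (a sketch only, not part of the proposal), the fold below hands each wrapper the object built from everything after it. The names compose, login, session and some_cool_app are hypothetical, and the string-returning callables stand in for real WSGI objects.

```python
def compose(wrappers, app):
    """Apply wrappers right to left, so the first one listed ends up outermost."""
    for wrapper in reversed(wrappers):
        # Each wrapper receives what follows it and replaces it
        # with its own return value.
        app = wrapper(app)
    return app


def login(app):
    def wrapped(request):
        return "login(" + app(request) + ")"
    return wrapped


def session(app):
    def wrapped(request):
        return "session(" + app(request) + ")"
    return wrapped


def some_cool_app(request):
    return "app:" + request


pipeline = compose([login, session], some_cool_app)
print(pipeline("/"))  # login(session(app:/))
```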

As I've been thinking through the implementation some more, I've realized 
that the "wrapper" keyword isn't really needed, if the construction 
responsibilities are divided a bit differently than I first had in 
mind.  More on that in a later post.


>>I'd rather just use 'from ProjectNameHere' and 'from "config_URL_here"', 
>>since these two syntaxes can cover everything you or I have thus far imagined.
>
>What exactly do you envision for config URLs?

In the simple case, they should just be relative URLs.


>Another case where I would use positional parameters to do something 
>different would be a cascading dispatcher, like:
>
>main is cascade from Paste:
>     static from Paste:
>         document_root = "/..."
>     blog from MyBlog:
>         ...
>     catch = 404

This syntax is ambiguous at 1-token lookahead, because you can't tell up 
front whether "static" is supposed to be a name that's being assigned (and 
therefore followed by "is" or "="), or whether it's a factory name (in 
which case it may be followed by "." and more identifiers, possibly 
followed by "from").
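
To make the lookahead point concrete, here is a small sketch using Python's standard tokenize module. The classify helper is hypothetical; it only shows that the first token is a NAME in both cases, so the decision has to wait for the second token.

```python
import io
import tokenize


def classify(line):
    """Return 'assign' or 'object' for one line by peeking at its second token."""
    source = line.strip() + "\n"
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    first = tokens[0]
    if first.type != tokenize.NAME:
        raise SyntaxError("expected a name at the start of the line")
    second = tokens[1]
    # The first token alone cannot settle it: "static" and "catch"
    # are both plain NAME tokens.
    if second.string in ("is", "="):
        return "assign"
    return "object"


print(classify("static from Paste:"))  # object
print(classify("catch = 404"))         # assign
```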

There might be a way to disambiguate it by complicating the grammar, I 
suppose, but I'm not sure I like it.  The way I've currently conceived of 
the grammar is that you can have either assignment (namespace) scopes or 
sequence scopes.  In my way of thinking, the top-level is a sequence scope, 
and everything else is a namespace scope, unless you introduce a sequence 
scope using "is:".  Thus, I see your example above as simply beginning 
"main is:", and then the contents can be a sequence.


>I think we disagree about one-app-per-file, and perhaps you also have a 
>notion that doesn't come out in all of your examples that you want a stack 
>represented at the top-level of the file...?  That is, like:
>
>   auth from Paste:
>     ...
>   # wraps...
>   session from Session:
>     ...
>   # wraps
>   main from MyApp:
>     ...
>
>
>If that's what you are getting at, I *really* don't like that.  Config 
>files don't use top-level ordering often at all.

That depends quite a lot on what the configuration file does, and its format.

However, if you would like to make it not be that way, all you have to do is:

     main from:
         # named stuff here

My reasoning for this is as follows.  In the simplest possible case, a user 
should be able to deploy an application using only this, as their entire file:

     app from SomeCoolApp

In other words, the above is the "hello world" of this language.  Your 
variation would be:

     main is app from SomeCoolApp

Not a lot of difference at this initial level, but now let's add a 
filter.  My way:

     login from Paste
     app from SomeCoolApp

Your way:

     main is:
         login from Paste
         app from SomeCoolApp

The big difference between your take and my take on this is that I'm 
viewing a file as specifying an object, while you're viewing it as defining 
a namespace of objects.


>   The few cases where order matters, it's purely as priority for 
> overlapping options (like rewrite rules).  And those few cases suck 
> anyway because of the ambiguity of overlap, so it's kind of the exception 
> that proves the rule.

But pipelines are sequences too.


>That's nothing but stupid boilerplate, because otherwise you can't get at 
>that function if you put everything in the "if" statement.  In the same 
>way, I want to be able to pick pieces out of a configuration 
>file without creating the main application, and I want to be able to look 
>in the main application without creating it (since it's mostly opaque once 
>it's been created).

You're making the assumption that what you "get" is the created object, 
while I'm assuming that what you get is a partially-applied factory, with 
properties that return configuration values or other factories.  You still 
have to call the factory to create the objects.

IOW, the way I see it is that you parse a configuration file by providing 
some scope-and-context information, and you get a factory object back.  If 
the factory object is a namespace, then you can access its properties to 
get values or child factories.  So, to create a library configuration file, 
I'd assume something like:

     some_factory:
         foo is blah:
             ...
         bar is feh:
             ...

What 'some_factory' actually creates is unimportant if it never gets 
called, and if you're just pulling pieces out of it in another 
configuration file, it won't get called.
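
A minimal sketch of that idea, assuming a hypothetical Factory class (nothing here is an existing API): settings and child factories are reachable as attributes, and nothing is constructed until a factory is actually called.

```python
class Factory:
    """A partially-applied factory whose settings are readable as attributes."""

    def __init__(self, target, **settings):
        self._target = target
        self._settings = settings

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._settings[name]
        except KeyError:
            raise AttributeError(name) from None

    def __call__(self, **overrides):
        merged = {**self._settings, **overrides}
        # Child factories are only called when the parent itself is called.
        kw = {k: v() if isinstance(v, Factory) else v for k, v in merged.items()}
        return self._target(**kw)


created = []  # records which targets were really constructed


def blah(**kw):
    created.append("blah")
    return ("blah", kw)


def feh(**kw):
    created.append("feh")
    return ("feh", kw)


def some_target(**kw):
    created.append("some_target")
    return ("some_target", kw)


some_factory = Factory(some_target, foo=Factory(blah, size=1), bar=Factory(feh))

foo_factory = some_factory.foo  # pulls a piece out; nothing is created yet
assert created == []
foo_obj = foo_factory()         # only 'blah' gets constructed
```

Here some_target is never called, which matches the point that what 'some_factory' creates does not matter if you only pull pieces out of it.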


>__main__ is completely unnecessary, as "main" seems quite special on its 
>own without scary underscores.  It's a very natural name, and one that 
>should be intuitive to anyone reading the file.  That it has a name shows 
>that it is a distinct entity, but a series of unnamed entries in the 
>config file doesn't imply that in the same way.

Yeah, it's just that it seems weird to me to have URLs represent namespaces 
that contain objects, but not be able to have URLs refer to objects!  That 
seems downright strange.

It also seems to me that the common case will be to define a single 
pipeline in a file (often with just a single component!), and that making 
the library developer's job easier (by avoiding the 'some_factory:' wrapper 
at the top level) makes the deployer's job harder (by requiring a "main 
is:" wrapper).

That pretty much seems like the tradeoff; either the multi-config developer 
has to do an extra indent, or else the deployer does.  My inclination is to 
favor the deployer.


>Maybe these don't have to be string literals.

They do if we want to keep it compatible with Python's tokenizer, and I 
definitely want that.  For one thing, it potentially allows implementing a 
pgen-based C parser for this.
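
As a quick check of the tokenizer point, the sketch below runs a sample file through Python's standard tokenize module. The file contents and the path "apps/cool.conf" are made up for illustration; token_summary is a hypothetical helper.

```python
import io
import tokenize

CONFIG = '''\
main is:
    login from Paste
    app from "apps/cool.conf"
'''


def token_summary(source):
    """Return (token type name, text) pairs, skipping NL, comments and the end marker."""
    skip = {tokenize.NL, tokenize.COMMENT, tokenize.ENDMARKER}
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type not in skip
    ]


for kind, text in token_summary(CONFIG):
    print(kind, repr(text))
```

The quoted URL arrives as a single STRING token. Written without quotes, the same path would be split into NAME and OP tokens ("apps", "/", "cool", ".", "conf"), which is why a string literal is the simpler choice.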

Speaking of parsers, here's my current idea of the grammar:

   sequence ::= object+
   object   ::= qname source? (suite | NEWLINE)
   source   ::= "from" (STRING | project)?
   suite    ::= ":" INDENT assign+ DEDENT
   assign   ::= (NAME | STRING) ( ("=" testlist NEWLINE) | ("is" objects) )
   objects  ::= object | ":" INDENT sequence DEDENT
   qname    ::= NAME ("." NAME)*

   project  ::= NAME ("-" NAME)* versions? extras?
   versions ::= cmpop version ("," cmpop version)* ","?
   version  ::= INT | FLOAT | STRING    # maybe just string?
   cmpop    ::= "<" | "<=" | "==" | "!=" | ">=" | ">"
   extras   ::= "[" NAME ("," NAME)* ","? "]"

As you can see, the core syntax is just seven productions, not counting the 
five for egg project requirements and the "testlist" productions from the 
Python expression grammar.  So, it's pretty darn simple as languages go.
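
To give a feel for how small a parser for this could be, here is a hedged sketch of the "project" production alone, built on the standard tokenize module. parse_project and its return shape are assumptions made for illustration, not a proposed API.

```python
import io
import tokenize

CMPOPS = {"<", "<=", "==", "!=", ">=", ">"}


def parse_project(text):
    """Parse: NAME ("-" NAME)* versions? extras?  ->  (name, versions, extras)."""
    skip = (tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER)
    toks = [t for t in tokenize.generate_tokens(io.StringIO(text).readline)
            if t.type not in skip]
    pos = 0

    def peek():
        return toks[pos].string if pos < len(toks) else None

    def take(*kinds):
        nonlocal pos
        if pos >= len(toks):
            raise SyntaxError("unexpected end of input")
        tok = toks[pos]
        if kinds and tok.type not in kinds:
            raise SyntaxError("unexpected token %r" % tok.string)
        pos += 1
        return tok

    # project ::= NAME ("-" NAME)*
    parts = [take(tokenize.NAME).string]
    while peek() == "-":
        take()
        parts.append(take(tokenize.NAME).string)

    # versions ::= cmpop version ("," cmpop version)* ","?
    versions = []
    while peek() in CMPOPS:
        op = take().string
        versions.append((op, take(tokenize.NUMBER, tokenize.STRING).string))
        if peek() != ",":
            break
        take()

    # extras ::= "[" NAME ("," NAME)* ","? "]"
    extras = []
    if peek() == "[":
        take()
        extras.append(take(tokenize.NAME).string)
        while peek() == ",":
            take()
            if peek() == "]":
                break
            extras.append(take(tokenize.NAME).string)
        if peek() != "]":
            raise SyntaxError("expected ']'")
        take()

    if pos != len(toks):
        raise SyntaxError("unexpected token %r" % toks[pos].string)
    return "-".join(parts), versions, extras


print(parse_project("Some-Cool-App>=1.0,<2.0[auth,admin]"))
```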

My rough concept of the semantics is that suites represent functions, and 
definitions are a cross between setting function attributes on the function 
defined by the enclosing suite, and setting a default value for a keyword 
argument within that enclosing function.  i.e.:

    foo:
        bar is baz:
           spam = 23

is roughly equivalent to:

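    def __main__(**kw):
        # 'bar' defaults to the object built by the nested factory
        kw.setdefault('bar', __main__.bar())
        return foo(**kw)

    def bar(**kw):
        kw.setdefault('spam', 23)
        return baz(**kw)

    # the same settings are also visible as plain attributes
    bar.spam = 23

    __main__.bar = bar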

For sequences of definitions, you get a function whose attributes come from 
the namespace of the last suite in the sequence.

This is all *rough* semantics, mind you; it will almost certainly *not* be 
implemented using Python functions, because of the need to manage many 
levels of nested scopes, and the calling signatures won't exactly match 
this either.  I'm just giving this "as functions" sketch to give an idea of 
why the whole thing can readily be introspected as data if you want it to be.
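
For what it's worth, the "as functions" sketch can be made runnable to show the introspection point. Everything below is an assumption layered on the rough semantics: foo and baz are stand-in factories, the top-level function is named main rather than __main__, it is assumed to finish by calling foo, and describe is a hypothetical helper.

```python
def baz(**kw):
    return ("baz", kw)


def foo(**kw):
    return ("foo", kw)


def bar(**kw):
    kw.setdefault("spam", 23)
    return baz(**kw)


bar.spam = 23


def main(**kw):
    kw.setdefault("bar", main.bar())
    return foo(**kw)  # assumption: the top-level suite ends by calling foo


main.bar = bar


def describe(fn):
    """Read the configuration back as nested dicts without calling any factory."""
    return {
        name: describe(value) if callable(value) else value
        for name, value in vars(fn).items()
    }


print(describe(main))  # {'bar': {'spam': 23}}
print(main())          # ('foo', {'bar': ('baz', {'spam': 23})})
```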


