[Email-SIG] API thoughts

Thu Mar 3 02:23:41 CET 2011

On Wed, 02 Mar 2011 15:46:24 -0500, Barry Warsaw <barry at python.org> wrote:
> On Mar 01, 2011, at 03:40 PM, R. David Murray wrote:
> >So, I think the "policy framework" is actually two things:  the
> >header/mime-types registry, and the Parser/Generator policies.  Let's have
> >'policy' refer to only the I/O policy, and call the other the email
> >class registry.
> 
> +1
> 
> This makes a lot of sense, and I'm glad you've been thinking about this more
> deeply than I have since we last bandied it about.  At the time, I thought a
> single policy hierarchy would probably be fine, but you've laid out a good
> argument for keeping them separate, and in fact not even calling the latter
> a 'policy'.  Here's another distinction:
> 
> Policy objects should be composable.  This would allow for a standard library
> of policies that could be mixed and matched for specific applications, and
> might even include some higher level policies like 'CGI' or 'NNTP'.  E.g. my
> applications might combine a standard 'don't-check-rfc-2047' policy with a
> 'use-only-CRNL' and 'die-on-defect'.

Yes, my current implementation of policy objects allows you to say
things like:

    policy = HTTP + Strict

where HTTP is the obvious and 'Strict' is a policy that sets the "raise
on defect" flag.
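
A minimal sketch of how that kind of composition could work. The class
layout, the attribute names 'linesep' and 'raise_on_defect', and the
"right-hand operand wins" rule are assumptions made for illustration,
not a description of the implementation mentioned above:

```python
class Policy:
    """Bag of settings; anything not overridden falls back to the class default."""

    # Hypothetical defaults, chosen only for this example.
    linesep = "\n"
    raise_on_defect = False

    def __init__(self, **overrides):
        for name in overrides:
            if name.startswith("_") or not hasattr(type(self), name):
                raise TypeError(f"unknown policy setting: {name!r}")
        # Remember only the explicit overrides so that composition does not
        # let one operand's defaults clobber the other's real settings.
        self._overrides = dict(overrides)
        self.__dict__.update(overrides)

    def __add__(self, other):
        # Merge explicit overrides; the right-hand operand wins on conflict.
        return type(self)(**{**self._overrides, **other._overrides})


HTTP = Policy(linesep="\r\n")
Strict = Policy(raise_on_defect=True)

policy = HTTP + Strict
```

Here 'policy' carries both the CRLF line separator from HTTP and the
raise-on-defect flag from Strict, while HTTP and Strict stay unchanged.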

> I wonder too, how sophisticated policy objects really need to be.  Are they
> just bags of attributes with some defaults, properties for access, maybe some
> validation, and composability?

Pretty much.  I think they will also contain some callable methods,
to provide hooks where a policy subclass can implement a custom policy.
My current implementation has such a hook for registering defects, which
would allow a custom policy to, for example, log the defects in addition
to or instead of putting them into the defects list.
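
As a sketch of what such a hook might look like: the method name
'register_defect', the 'Defect' exception, and the 'FakeMessage' stand-in
below are all hypothetical names used for illustration only.

```python
import logging


class Defect(Exception):
    """Hypothetical stand-in for a parsing defect."""


class Policy:
    raise_on_defect = False

    def register_defect(self, obj, defect):
        # Default behaviour: raise if strict, otherwise record on the object.
        if self.raise_on_defect:
            raise defect
        obj.defects.append(defect)


class LoggingPolicy(Policy):
    def register_defect(self, obj, defect):
        # Log the defect in addition to the default handling.
        logging.getLogger("email.defects").warning("defect: %s", defect)
        super().register_defect(obj, defect)


class FakeMessage:
    """Minimal object with a defects list, standing in for a Message."""

    def __init__(self):
        self.defects = []


msg = FakeMessage()
LoggingPolicy().register_defect(msg, Defect("missing boundary"))
```

A policy that wanted to log *instead of* recording would simply skip the
call to super().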

> As for the registry, I don't think you need anything near that.  You just need
> to say "when you see this mime-type, create an object using this callable".
> Multiple registrations might be useful, but I don't think composability is.

Well, I'm thinking that a minimal sort of composability *is* useful.
One of the annoying things about class hierarchies is that if you want to
add a feature to the base class, you have to make new subclasses for *all*
of the classes in the hierarchy (unless you monkey patch).  What I was
thinking of was to have the registry have a 'base class' slot that got
used as the base class for all the mime-type classes, composed on the fly
at instantiation time (and similarly for the headers).  That way if you
wanted to add features to all the classes in the hierarchy, you could
register your custom 'base class' and not need to touch anything else.
But since the API for the registry is now a callable, and especially if
we specify it as returning callables, then doing such composition could
be left to the application (perhaps with a recipe in the docs).
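
The sort of recipe meant here might look roughly like the following. It
is only a sketch: 'Registry', 'TextPart', and 'Describable' are made-up
names, and composing the class at lookup time with type() is one possible
approach among several.

```python
class Registry:
    """Callable registry: maps a mime-type to a class, mixing in a common base."""

    def __init__(self, base=object):
        self.base = base
        self._classes = {}

    def register(self, mime_type, cls):
        self._classes[mime_type] = cls

    def __call__(self, mime_type):
        cls = self._classes[mime_type]
        if self.base is object or issubclass(cls, self.base):
            return cls
        # Compose on the fly: the registered class first, then the shared
        # base, so the base's features are added without editing cls.
        return type(cls.__name__, (cls, self.base), {})


class TextPart:
    def get_text(self):
        return "hello"


class Describable:
    """Application-supplied base that every registered class should gain."""

    def describe(self):
        return f"<{type(self).__name__}>"


registry = Registry(base=Describable)
registry.register("text/plain", TextPart)

part = registry("text/plain")()
```

The returned object is still a TextPart, but it also picks up describe()
from the application's base class without TextPart itself being touched.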

Composing registries can thus also be left to the application.  email6
itself should have only one, I think, or if there are two the other will
be the email5 back-compat registry and there'd be no reason to compose
with it.

I'm not sure what you mean by multiple registrations.  Can you give
an example?

> >The real meat of email6, then, is the header/mime-types registry, and
> >the changes in the API of the resulting Message objects.  The parser
> >currently accepts a _factory argument that specifies the object to be used
> >in creating the Message.   I propose that we deprecate this argument,
> >but that any code using it gets the old behavior of the parser (using
> >_factory to create the class for any new sub-objects).  Then we introduce
> >a new argument, 'factory'.  This new argument would expect a callable
> >that takes a mime-type as its argument, and returns an appropriate class.
> >The parser would be re-written so that it could use this factory, and
> >the backward compatibility case would be trivial to implement.
> 
> +1.  The underscore name in _factory is a historical wart that's not needed
> any more.  I'm not even sure it makes much sense any more in Message
> subclasses.  It *does* still make sense in e.g. add_header() where there's a
> potential name collision between the arguments and the **params.  We should
> evaluate these more carefully given today's API and clean this up if possible
> (modulo all b/c considerations).

Ah, so *that's* what those underscores are for.  I always wondered.
Yeah, I think we can do a lot of cleanup here.
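
For the '_factory' to 'factory' transition proposed above, the backward
compatibility shim could be sketched like this. The helper name
'resolve_factory' and the placeholder classes are hypothetical; the real
parser signature is not being described here.

```python
import warnings


class Message:
    """Placeholder for the default message class."""


class LegacyMessage(Message):
    """Placeholder for a class passed via the old _factory argument."""


def resolve_factory(_factory=None, factory=None):
    """Return a callable mapping a mime-type to the class to instantiate."""
    if _factory is not None and factory is not None:
        raise TypeError("pass either _factory or factory, not both")
    if _factory is not None:
        warnings.warn("_factory is deprecated; use factory",
                      DeprecationWarning, stacklevel=2)
        # Old behaviour: the same class for every sub-object,
        # regardless of mime-type.
        return lambda mime_type: _factory
    if factory is None:
        return lambda mime_type: Message
    return factory


# New-style use: the application decides per mime-type.
def my_factory(mime_type):
    return LegacyMessage if mime_type.startswith("text/") else Message


chooser = resolve_factory(factory=my_factory)
```

With this shape the parser only ever deals with a mime-type-to-class
callable, and the deprecated argument is reduced to a one-line wrapper.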

> Cool.  Really great stuff David.

Thanks.

--David