[Distutils] Object loading spec
Ian Bicking
ianb at colorstudy.com
Tue Oct 30 17:24:34 CET 2007
I mentioned some time ago that it would be nice if everyone could agree
on how to configure object references. Mostly I just had in mind having
Paste Deploy and zc.buildout use the same syntax, but it's something
that comes up in ad hoc situations often as well.
I wrote up a small implementation to push this idea forward a bit. I'm
hoping to get some feedback here. In an unfortunate mishmash of
syllables (name suggestions welcome) I named it obconfloader. You can
read the description here:
http://svn.pythonpaste.org/ObConfLoader/trunk/docs/index.txt
I've also copied that description into this email.
Status & License
----------------
ObConfLoader is under an `MIT-style permissive license
<http://svn.pythonpaste.org/ObConfLoader/trunk/docs/license.txt>`_.
Discussion should occur on `distutils-sig
<http://www.python.org/community/sigs/current/distutils-sig/>`_;
please be sure to put "ObConfLoader" in the subject line.
The package is available in a `subversion repository
<http://svn.pythonpaste.org/ObConfLoader/trunk#egg=ObConfLoader>`_,
and the trunk can be installed with ``easy_install
ObConfLoader==dev``. You can get a checkout with::

    svn co http://svn.pythonpaste.org/ObConfLoader/trunk ObConfLoader

Introduction
------------
ObConfLoader allows you to load objects from strings, typically for
use in config files (or from the command line, or other locations
where object references are necessary but the format is constrained).
It also allows you to utilize `Setuptools entry points
<http://peak.telecommunity.com/DevCenter/setuptools#entry-points>`_ to
make public objects that can be referenced. The concept of an entry
point group, or API, allows backward compatibility when using entry
points.
The basic usage is very simple::

    from obconfloader import load_ob
    obj, ep_group = load_ob(a_string, ['mypackage.api'])
    assert ep_group is None or ep_group == 'mypackage.api'

This loads the object from the given string, returning the object and
the API it supports. The second argument is a list of entry point
groups that you support.
You can support multiple entry point groups, or no groups (by leaving
out the second argument, or passing ``None``). If you leave out the
second argument, ``ep_group`` will always be ``None``.
There are several possible exceptions:

``LoadError``:
    The parent of all other exceptions; a generic error.

``ConfigSyntaxError``:
    The string itself is malformed.

``BadGroupError``:
    The group given doesn't match up with an acceptable input group.

``NotFoundError``:
    The referenced object can't be found (something like
    DistributionNotFound, ImportError, AttributeError).

``PythonError``:
    Some error with a Python expression, Python syntax, etc.

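To illustrate how lookup failures might be folded into ``NotFoundError``, here is a minimal sketch. The two exception classes are local stand-ins for the ones described above (the real classes live in obconfloader and may take different arguments), and ``resolve`` is a hypothetical helper, not part of the package:

```python
import importlib


# Stand-ins for the exceptions described above; not imported from
# obconfloader, whose real classes may differ.
class LoadError(Exception):
    """Generic failure; parent of the more specific errors."""


class NotFoundError(LoadError):
    """The referenced object could not be found."""


def resolve(module_name, dotted_name=None):
    """Import module_name, then walk dotted_name attribute by attribute.

    ImportError and AttributeError are re-raised as NotFoundError so a
    caller only has to catch LoadError.
    """
    try:
        obj = importlib.import_module(module_name)
        for part in (dotted_name.split(".") if dotted_name else []):
            obj = getattr(obj, part)
    except (ImportError, AttributeError) as exc:
        raise NotFoundError(
            "cannot load %s:%s (%s)" % (module_name, dotted_name, exc)
        ) from exc
    return obj
```

Because ``NotFoundError`` derives from ``LoadError``, a single ``except LoadError`` around the call is enough to report any of these failures back to the user.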
Specification Format
--------------------
There are several formats for strings:

``file /path/to/file.py``
    This loads the given file, and returns the module object.

``file /path/to/file.py:dotted.name``
    This loads the given file, and returns the object ``dotted.name``
    inside it.

``file /path/to/file.py:dotted.name [entry.point.group]``
    This loads the object, and indicates that the object supports the
    group ``entry.point.group``. This can be used when multiple APIs
    are supported, and the object doesn't use the default API.

``python module.name:dotted.name``
    This loads the module ``module.name``, and returns the
    ``dotted.name`` object. Of course you can leave out
    ``dotted.name`` to return the module, and also add an entry point
    group.

``egg Distribution``
    This loads the distribution (the package) ``Distribution``, and
    gets the ``main`` entry point, with the given entry point group.

``egg Distribution:entry_point_name``
    Instead of the ``main`` entry point, this gets
    ``entry_point_name``.

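The formats above all share the shape ``scheme target[:name] [group]``. As a rough sketch of how such a string could be taken apart (``Spec`` and ``parse_spec`` are hypothetical names for illustration, not ObConfLoader's own parser, and a plain ``ValueError`` stands in for ``ConfigSyntaxError``):

```python
import re
from collections import namedtuple

# Hypothetical helper names; this is not ObConfLoader's own parser.
Spec = namedtuple("Spec", "scheme target name group")

_SPEC_RE = re.compile(
    r"""^\s*
        (?P<scheme>[A-Za-z][A-Za-z0-9_]*)   # file, python, egg, ...
        \s+
        (?P<body>.*?)                       # path, module or distribution
        (?:\s*\[(?P<group>[^\]]+)\])?       # optional [entry.point.group]
        \s*$""",
    re.VERBOSE,
)


def parse_spec(spec):
    """Split an object reference string into scheme, target, name, group."""
    match = _SPEC_RE.match(spec)
    if match is None or not match.group("body"):
        raise ValueError("malformed object reference: %r" % (spec,))
    # Everything before the first colon is the file, module or
    # distribution; the rest is the dotted name or entry point name.
    # This naive split would mishandle Windows drive letters.
    target, _, name = match.group("body").partition(":")
    return Spec(match.group("scheme").lower(), target, name or None,
                match.group("group"))
```

For example, ``parse_spec("python module.name:dotted.name")`` yields the scheme ``python``, the target ``module.name``, the name ``dotted.name``, and no group.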
Other Schemes
-------------
You can pass in a dictionary of extra schemes (besides ``file``,
``python``, and ``egg``) with the ``extra_schemes`` argument. All
schemes are converted to lower case. The signature of a loader is::

    def scheme_loader(string, groups, orig_string, position, group):
        if problem:
            raise LoadError("message", orig_string, position)
        return loaded_obj, group

The meaning of the arguments:

``string``:
    The string you should load, like
    ``"/path/to/file.py:dotted.name"``. This leaves out the trailing
    group and the scheme.

``groups``:
    The entry point groups accepted. Schemes besides ``egg`` probably
    will ignore this.

``orig_string``:
    The full string, with scheme. Used primarily with any
    ``LoadError`` exceptions, to give the user back the full string
    they entered.

``position``:
    The position where the string was found; also used with
    exceptions.

``group``:
    If the string contained an explicit group, this will be that group
    (otherwise None). If you don't have any explicit notion of groups
    then you can just do the return as in the example.

You should raise LoadError exceptions when at all possible; failures
generally *are* expected.
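As a concrete illustration of the loader signature, here is a sketch of a made-up ``builtin`` scheme that looks names up in Python's ``builtins`` module. The scheme name and ``builtin_loader`` are invented for this example, and ``LoadError`` is a local stand-in whose constructor only mirrors the three arguments shown in the signature example:

```python
import builtins


class LoadError(Exception):
    # Stand-in for obconfloader's LoadError; the real class may differ.
    def __init__(self, message, orig_string=None, position=None):
        Exception.__init__(self, message)
        self.orig_string = orig_string
        self.position = position


def builtin_loader(string, groups, orig_string, position, group):
    """Loader for a hypothetical ``builtin`` scheme, e.g. ``builtin len``."""
    name = string.strip()
    try:
        obj = getattr(builtins, name)
    except AttributeError:
        # Report the failure with the full string the user typed.
        raise LoadError("no builtin named %r" % name, orig_string, position)
    # This scheme has no notion of groups, so pass the explicit one through.
    return obj, group


# Keyed by the lower-case scheme name.
extra_schemes = {"builtin": builtin_loader}
```

A dictionary like this would presumably be handed to ``load_ob`` through its ``extra_schemes`` argument.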
Open Issues
-----------
* Just using a different data source (like a database) for Python code
instead of a file isn't super-easy (you'd have to provide some kind
of extra scheme). Maybe a little refactoring of ``load_file`` would
make it easier.
* Possibly ``load_file`` itself should have some resource-finding
pluggability. This could be used to restrict the files that can be
accessed.
* Disabling a scheme by setting ``extra_schemes=dict(file=None)``
would be handy, but isn't clean now.
* The form ``python module.name.object.name`` *looks* reasonable, but it
has to be ``python module.name:object.name``. The error message
isn't very helpful either (basically an import error). Probably it
should Just Work.
* I don't scan the system for extra kinds of schemes. No
pluggability. I am shying away from doing complete entry point
scans (e.g., with ``pkg_resources.iter_entry_points``) because of
efficiency concerns.
* There's no way to indicate an egg entry point name without a
distribution. Maybe ``egg *:ep_name`` should be allowed? This is
inefficient, but the ``*`` makes it *look* inefficient too.
* Should an implied scheme be allowed? E.g., something like Buffet
(a template abstraction layer) basically needs an object reference
to the renderer. Right now this *must* be an egg/entry point, but
there's no reason it couldn't use other kinds of references. But
the ideal specification would just be ``Distribution``, not ``egg
Distribution``. A default scheme of egg would fix this.
* There's nothing for non-Python-object resources, like files or
templates. This is probably fine.
* There's no built-in indirection. Ideally such indirection would
happen in the config file or elsewhere in the system.
* Some exceptions are swallowed, though an attempt is made to swallow
only the most boring exceptions.
* There are some FIXMEs in the code.
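On the ``python module.name.object.name`` issue above, one possible way to make the all-dots form work is to try the longest importable prefix as the module and treat the remainder as attributes. This is only a sketch of that idea; ``resolve_dotted`` is a hypothetical name and nothing like it is claimed to exist in the package:

```python
import importlib


def resolve_dotted(path):
    """Resolve 'pkg.mod.obj.attr' without a colon separator.

    Tries the longest prefix as a module first, then shorter ones, and
    looks up the remaining parts as attributes.
    """
    parts = path.split(".")
    for split in range(len(parts), 0, -1):
        module_name = ".".join(parts[:split])
        try:
            obj = importlib.import_module(module_name)
        except ImportError:
            # Caveat: this also hides ImportErrors raised *inside* an
            # existing module, which a real implementation should not.
            continue
        try:
            for attr in parts[split:]:
                obj = getattr(obj, attr)
        except AttributeError:
            continue
        return obj
    raise ImportError("cannot resolve %r" % (path,))
```

The cost is some ambiguity and a few wasted import attempts, which is part of why the explicit colon form is easier to report errors for.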