[Twisted-Python] Twisted Plugins - Implementation Discussion

Stephen Thorne

7 Apr 2011

12:35 a.m.

G'day,

So Glyph and I had a discussion about the architecture and implementation of plugins on IRC this week. I raised some issues that I've seen with implementing plugins in that discussion, and he said that I should take the discussion to the list because IRC wasn't the right place for it.

First of all, a quick discussion of the current plugin architecture, so that we're on the same page.

'twistd' automatically imports python modules from under twisted/plugins relative to sys.path[1], or it loads a cache of those plugins from dropin.cache. The reason it loads all of those plugins is so that 'tapname' and 'description' can be grabbed out of all of the serviceMaker attributes of all those modules. Then running 'twistd' shows a helpful list of commands, and 'twistd $tapname' uses the correct serviceMaker to start whatever service is specified.

Part of the discussion was about how to rewrite this in such a way that no python code needs to be run in order to discover all the tapname+description combinations that are available to twistd. This is because of a perceived performance and sanity deficit in using 'twistd'.

In the course of the discussion I raised several things that I consider annoyances in the twisted plugin system. I will repeat them here.

First, the reason I use twisted plugins is because they're a way to easily do sensible things with logging, daemonisation and interaction from init.d files.

* The number of imports required to compose a plugin is annoying. 2 interfaces from two different packages, plus needing zope.interface.implements.
* I've never liked the twisted arg parser, I use it only grudgingly, it would be nice to be able to throw argv at my make_service call.
* The entire task of having this python plugin is to link up the metadata with a more or less standard
* It's very easy to accidentally make your plugin load your package for every other twistd daemon running out of the same plugin cache.
* The practice of putting a module under twisted/plugins/$mymodulehere.py upsets my equilibrium, the only reason I tolerate this kind of thing is that I install my python code via rpms and I automatically get installation, ownership and uninstallation done in a stable manner.

For your consideration, and (constructive) criticism, here is a twisted plugin that is nearly identical to 6 that I have running in production:

    from zope.interface import implements
    from twisted.python import usage
    from twisted.plugin import IPlugin
    from twisted.application.service import IServiceMaker

    class Options(usage.Options):
        optFlags = [['debug', 'd', 'Emit debug messages']]

    class ExampleServiceMaker(object):
        implements(IServiceMaker, IPlugin)
        tapname = 'example'
        description = 'Example Twistd Plugin'
        options = Options

        def makeService(self, options):
            from examplepackage.examplemodule import make_service
            return make_service(debug=options['debug'])

    serviceMaker = ExampleServiceMaker()

From these 19 lines of code there are 4 things that are relevant:

    tapname = example
    description = 'Example Twistd Plugin'
    options = {'debug':True} if args == ['--debug'] else {'debug':False}
    service = examplepackage.examplemodule.make_service(**options)

Only 2 of which are relevant for running a twistd daemon other than 'example'.

So the goal of my post to this mailing list is:

* I would like glyph's goal of having less arbitrary code executed at twistd launch time to become a realisation,
* I would like the process of creating a twisted plugin to be less of a cut+paste+fill-in-blanks hassle.

--
Regards,
Stephen Thorne
Development Engineer
Netbox Blue

Phil Christensen

7 Apr

4:06 a.m.

On Apr 6, 2011, at 8:35 PM, Stephen Thorne wrote:

...

For your consideration, and (constructive) criticism, here is a twisted plugin that is nearly identical to 6 that I have running in production: [snip] serviceMaker = ExampleServiceMaker()

From these 19 lines of code there are 4 things that are relevant:

    tapname = example
    description = 'Example Twistd Plugin'
    options = {'debug':True} if args == ['--debug'] else {'debug':False}
    service = examplepackage.examplemodule.make_service(**options)

Only 2 of which are relevant for running a twistd daemon other than 'example'

So the goal of my post to this mailing list is:

* I would like glyph's goal of having less arbitrary code executed at twistd launch time to become a realisation,

Makes sense. My first inclination is to suggest creating metadata files that are found via pkg_resources.

...

* I would like the process of creating a twisted plugin to be less of a cut+paste+fill-in-blanks hassle.

This doesn't bother me so much. To go back in time a bit:

...

First, the reason I use twisted plugins is because they're a way to easily do sensible things with logging, daemonisation and interaction from init.d files.

* The number of imports required to compose a plugin is annoying. 2 interfaces from two different packages, plus needing zope.interface.implements.

* I've never liked the twisted arg parser, I use it only grudgingly, it would be nice to be able to throw argv at my make_service call.

I've got no issues with t.p.usage, and tend to keep its definitions in the plugin class. But I agree you should be able to skip it, although I have a suspicion you probably already can by being sneaky.

...

* The entire task of having this python plugin is to link up the metadata with a more or less standard

Not sure if I get this part. I tend to think of plugins as the service bootstrap file, like an int main(). I would probably *not* be into the idea of passing sys.argv directly to a service, for example.

...

* It's very easy to accidentally make your plugin load your package for every other twistd daemon running out of the same plugin cache.

Never had this happen, but I think I can see where it might. It would be solved by some kind of external metadata, though, right?

...

* The practice of putting a module under twisted/plugins/$mymodulehere.py upsets my equilibrium, the only reason I tolerate this kind of thing is that I install my python code via rpms and I automatically get installation, ownership and uninstallation done in a stable manner.

The only issue I've found with this is the issue of having to create plugin cache files. Apart from the spurious error messages that were (partially?) covered at the sprint recently, the benefit they provide is definitely negated by bad permissions. I've done some acrobatics inside my setup.py to make it work, but it involves different steps depending on whether you're installing or building a package.

Still, it seems like most of their necessity would be negated by using external metadata files. It seems to be the way most plugin systems end up going in some way or another.

Anyways, that's just my 2/100ths. I'm pretty happy with most of the plugin model, but I think there's a lot of room for improvement, particularly in the area of dropin.cache files.

-phil

Glyph Lefkowitz

5:03 a.m.

On Apr 7, 2011, at 12:06 AM, Phil Christensen wrote:

...

On Apr 6, 2011, at 8:35 PM, Stephen Thorne wrote:

...
For your consideration, and (constructive) criticism, here is a twisted plugin that is nearly identical to 6 that I have running in production: [snip] serviceMaker = ExampleServiceMaker()

From these 19 lines of code there are 4 things that are relevant:

    tapname = example
    description = 'Example Twistd Plugin'
    options = {'debug':True} if args == ['--debug'] else {'debug':False}
    service = examplepackage.examplemodule.make_service(**options)

Only 2 of which are relevant for running a twistd daemon other than 'example'

So the goal of my post to this mailing list is:

* I would like glyph's goal of having less arbitrary code executed at twistd launch time to become a realisation,

Makes sense. My first inclination is to suggest creating metadata files that are found via pkg_resources.

We already use a similar mechanism, twisted.python.modules, which uses the same underlying standards as pkg_resources (PEP 302) but is somewhat more flexible. And we create a metadata file (dropin.cache) which is stored and retrieved using this mechanism.

...

...
* The entire task of having this python plugin is to link up the metadata with a more or less standard

Not sure if I get this part. I tend to think of plugins as the service bootstrap file, like an int main(). I would probably *not* be into the idea of passing sys.argv directly to a service, for example.

Why not? It's just a list of strings. You should be able to deal with it how you like. (But as I said in a previous message: this is a separate issue.)

...

...
* It's very easy to accidentally make your plugin load your package for every other twistd daemon running out of the same plugin cache.

Never had this happen, but I think I can see where it might. It would be solved by some kind of external metadata, though, right?

It already is solved by the external metadata... sort of. If you look at the implementation of CachedPlugin, you can see that it actually already has a name and description! There are two problems though: first is that this is hard-coded to be the module's name and docstring, but more importantly, there's just no way to get at those attributes via the getPlugin interface, which implicitly invokes '__conform__' via adaptation (and therefore load()).

...

* The practice of putting a module under twisted/plugins/$mymodulehere.py upsets my equilibrium, the only reason I tolerate this kind of thing is that I install my python code via rpms and I automatically get installation, ownership and uninstallation done in a stable manner.

The only issue I've found with this is the issue of having to create plugin cache files. Apart from the spurious error messages that were (partially?) covered at the sprint recently, the benefit they provide is definitely negated by bad permissions. I've done some acrobatics inside my setup.py to make it work, but it involves different steps depending on whether you're installing or building a package.

What is "this kind of thing", though? The plugins have to go in some defined namespace in order to be enumerated. Even if we were to implement something based on purely static metadata, you'd still have to list a directory to get at that metadata. Making the namespace be owned by the module doing the importing makes sense. For what it's worth, bzrlib does this too, but by convention in a bzr plugin you put _all_ your code into bzrlib/plugins/foo/*.py, and your package is bzrlib.plugins.foo. (This will work fine with Twisted if you want to do it that way.)

...

Still, it seems like most of their necessity would be negated by using external metadata files. It seems to be the way most plugin systems end up going in some way or another.

dropin.cache is an external metadata file ;-).

...

Anyways, that's just my 2/100ths. I'm pretty happy with most of the plugin model, but I think there's a lot of room for improvement, particularly in the area of dropin.cache files.

Thanks for your feedback.

Tim Allen

4:31 a.m.

On Thu, Apr 07, 2011 at 10:35:18AM +1000, Stephen Thorne wrote:

...

So the goal of my post to this mailing list is:

* I would like glyph's goal of having less arbitrary code executed at twistd launch time to become a realisation,

* I would like the process of creating a twisted plugin to be less of a cut+paste+fill-in-blanks hassle.

I notice that Tarek Ziadé's "distutils2" is moving away from "load a Python module and probe for well-known attributes" toward "define all metadata in a static file". It sounds like you want a similar thing for twistd plugins. Perhaps an implementation might look something like this:

- At startup, twistd scans twisted/plugin directories on sys.path looking for files whose filenames end with '.twistd'.
- Each such file is loaded with Python's ConfigParser module.
- Each section in the ConfigParser module represents a plugin whose 'tapname' is the section name.
- Each section has a 'description' option, whose value is a human-readable string describing the plugin.
- Each section has a 'module' option, whose value is a string that can be passed to Python's __import__ builtin to get a Python module.

...where the module defined by 'module' exposes a 'make_service(options)' function, and an 'options' global variable that is an instance of t.p.usage.Options.

I know you said you didn't like t.p.usage.Options, but I'd be sad to lose the ability for twistd to support "twistd $PLUGIN --help", and for that kind of introspection to work, the options data needs to be in *some* known format. Maybe this might be the time to move to the stdlib's optparse - or maybe not, now that optparse is (presumably) deprecated in favour of argparse. Maybe twistd could examine a number of different well-known variable-names, for different option-parsing libraries.

Just tossing this out as a strawman for people to point and laugh at.
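A rough sketch of the discovery step proposed above, for illustration only. It assumes Python 3's `configparser` (the successor to the `ConfigParser` module named in the message), a `twisted/plugins` directory under each search-path entry, and a hypothetical helper name `scan_twistd_files`; none of this is existing twistd behaviour.

```python
import configparser
import os
import tempfile


def scan_twistd_files(search_path):
    """Collect plugin metadata from '*.twistd' files without importing plugin code.

    Hypothetical helper: the directory layout and option names follow the
    strawman proposal, not anything twistd actually implements.
    """
    found = {}
    for entry in search_path:
        plugin_dir = os.path.join(entry, "twisted", "plugins")
        if not os.path.isdir(plugin_dir):
            continue
        for filename in sorted(os.listdir(plugin_dir)):
            if not filename.endswith(".twistd"):
                continue
            parser = configparser.ConfigParser()
            parser.read(os.path.join(plugin_dir, filename))
            # Each section is one plugin; the section name is its tapname.
            for tapname in parser.sections():
                found[tapname] = {
                    "description": parser.get(tapname, "description"),
                    "module": parser.get(tapname, "module"),
                }
    return found


# Demonstration with a throwaway directory standing in for a sys.path entry.
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "twisted", "plugins"))
with open(os.path.join(demo_root, "twisted", "plugins", "example.twistd"), "w") as f:
    f.write(
        "[example]\n"
        "description = Example Twistd Plugin\n"
        "module = examplepackage.examplemodule\n"
    )

plugins = scan_twistd_files([demo_root])
print(plugins["example"]["description"])
```

Only the 'module' string is recorded here; importing it would be deferred until 'twistd example' is actually run.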

Glyph Lefkowitz

4:54 a.m.

On Apr 7, 2011, at 12:31 AM, Tim Allen wrote:

...

On Thu, Apr 07, 2011 at 10:35:18AM +1000, Stephen Thorne wrote:

...
So the goal of my post to this mailing list is:

* I would like glyph's goal of having less arbitrary code executed at twistd launch time to become a realisation,

* I would like the process of creating a twisted plugin to be less of a cut+paste+fill-in-blanks hassle.

I notice that Tarek Ziadé's "distutils2" is moving away from "load a Python module and probe for well-known attributes" toward "define all metadata in a static file". It sounds like you want a similar thing for twistd plugins. Perhaps an implementation might look something like this:

- At startup, twistd scans twisted/plugin directories on sys.path looking for files whose filenames end with '.twistd'.

While I'm sympathetic to the goal here, I don't like this particular implementation strategy for several reasons.

Right now, in order to properly install a 'twistd' plugin (including those that come with Twisted), you have to do two things:

1. install some .py files into a package
2. as the user doing the installation (probably root), run the one-liner at the bottom of http://twistedmatrix.com/documents/11.0.0/core/howto/plugin.html#auto3. If invoking python code in your installation process is too hard, this can be approximated with 'twistd --help 2>&1 > /dev/null'.

This is rocket science. Nobody can manage it. Seriously. After literally _years_ of fighting with conflicting python installation techniques in Debian and Ubuntu, I think that we finally have something that works about half of the time. I haven't checked up on RedHat in a while and I don't know if they have a working system to do this yet, but they didn't last I checked.

If we invent our own file extension which has to be separately installed, we have to teach distutils, and setuptools, and distribute, and pip, and distutils2, and 'packaging' (as I'm sure that will eventually be incompatible with distutils2 for some silly reason), and easy_install, and dpkg, and rpm, and yum, and apt, and probably five other horrible Python packaging things that I don't even know about yet, how to deal with it.

So I am strongly in favor of keeping everything in .py files and just making a minor tweak to what's stored in dropin.cache (and perhaps allowing dropin.cache to be stored in some location more likely to be writable by individual users, in case the installation process doesn't update it).

Because, frankly, Python installation tools REALLY REALLY SHOULD be able to install Python files into Python packages. I'm not sure I can make any other assertions quite so strongly. I'm pretty sure that this is a problem that more than one project is interested in solving. No other projects are interested in installing '.twistd' files though, I can assure you of that :).

...

- Each such file is loaded with Python's ConfigParser module.

The first rule of the Twisted cabal is of course "don't talk about the Twisted cabal", but the second and possibly even more important rule is "no '.ini' files". I'd seriously much rather we use XML. And you can ask Stephen how he feels about XML configuration files. (Although I'd strongly recommend standing well clear of him when you do that, and making sure that no sharp or otherwise dangerous objects are within easy reach.)

...

- Each section blah blah terrible user interface stuff about 'sections' and other misfeatures of ini files.

I don't want a solution that is hard-coded to deal with the metadata that 'twistd' specifically needs, as Twisted plugins are already used for more than just twistd plugins, and I'd like them to be used for even more. An interface that limits the metadata to ConfigParser sections would make it awkward to fit into a management GUI or web page. Plus, the quoting rules for long strings in ini files make it unsuitable for storing long descriptive strings (which is one of the primary use-cases here).

...

...where the module defined by 'module' exposes a 'make_service(options)' function, and an 'options' global variable that is an instance of t.p.usage.Options.

I know you said you didn't like t.p.usage.Options, but I'd be sad to lose the ability for twistd to support "twistd $PLUGIN --help", and for that kind of introspection to work, the options data needs to be in *some* known format. Maybe this might be the time to move to the stdlib's optparse - or maybe not, now that optparse is (presumably) deprecated in favour of argparse. Maybe twistd could examine a number of different well-known variable-names, for different option-parsing libraries.

For what it's worth, I don't care about this at all. It's a completely separate issue from the main stuff I care about, and while we should be able to simply delegate to a function that takes a list of strings, I will insist that we fix that separately.

...

Just tossing this out as a strawman for people to point and laugh at.

Ha, ha ha ha. (point).

Thanks for the feedback though. These are all very common suggestions, and I'm glad for the opportunity to point out why we haven't already done them.

So as not to make this message too long, I'll defer a description of my own preferred implementation strategy for a future post to this thread.

-glyph

Tim Allen

5:08 a.m.

On Thu, Apr 07, 2011 at 12:54:45AM -0400, Glyph Lefkowitz wrote:

...

If we invent our own file extension which has to be separately installed, we have to teach distutils, and setuptools, and distribute, and pip, and distutils2, and 'packaging' (as I'm sure that will eventually be incompatible with distutils2 for some silly reason), and easy_install, and dpkg, and rpm, and yum, and apt, and probably five other horrible Python packaging things that I don't even know about yet, how to deal with it. So I am strongly in favor of keeping everything in .py files and just making a minor tweak to what's stored in dropin.cache (and perhaps allowing dropin.cache to be stored in some location more likely to be writable by individual users, in case the installation process doesn't update it).

My understanding was that .py files have to be installed into twisted/plugins as binary blobs, not as ordinary Python modules, because of special rules like "twisted/plugins must not be a Python package". If distutils/setuptools/etc. can handle a binary blob with a ".py" extension, I figured it could handle a binary blob with any other extension. If that's wrong, then yeah, I guess that would be a problem.

...

...
- Each such file is loaded with Python's ConfigParser module.

The first rule of the Twisted cabal is of course "don't talk about the Twisted cabal", but the second and possibly even more important rule is "no '.ini' files". I'd seriously much rather we use XML. And you can ask Stephen how he feels about XML configuration files. (Although I'd strongly recommend standing well clear of him when you do that, and making sure that no sharp or otherwise dangerous objects are within easy reach.)

Well, the nice thing about ConfigParser is that it's in the stdlib, and people already know how to create them, and rolling yet-another-config-file-format seems crazy in this day and age. If you need a non-Turing-complete config language and rule out .ini and XML, I'm not sure what's left. JSON, perhaps.

...

I don't want a solution that is hard-coded to deal with the metadata that 'twistd' specifically needs, as Twisted plugins are already used for more than just twistd plugins, and I'd like them to be used for even more.

I've never actually come across anything that used Twisted plugins besides twistd, so I'd forgotten they weren't twistd-specific.

...

So as not to make this message too long, I'll defer a description of my own preferred implementation strategy for a future post to this thread.

I'll look forward to it. :) Tim.

Glyph Lefkowitz

5:46 a.m.

On Apr 7, 2011, at 1:08 AM, Tim Allen wrote:

...

Well, the nice thing about ConfigParser is that it's in the stdlib, and people already know how to create them, and rolling yet-another-config-file-format seems crazy in this day and age.

My point was really that people think they know how to create these, but actually they don't. Pop quiz, hot shot: what is the quoting rule to put a linebreak with preserved trailing whitespace into a value in a .ini file? Into a key?
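For readers who want to try the pop quiz, here is a small experiment. It assumes Python 3's `configparser` with default settings, and the section and option names are made up for the demonstration. It shows only what the default parser does with trailing whitespace on a multi-line value; it does not answer the question of how to preserve it.

```python
import configparser

parser = configparser.ConfigParser()
# The first line of the value ends in three spaces; the indented line continues it.
parser.read_string("[example]\ndescription = first line   \n    second line\n")

value = parser.get("example", "description")
print(repr(value))
```

With the default parser the continuation line is kept, but the trailing spaces after "first line" and the indentation of the second line are both dropped.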

...

If you need a non-Turing-complete config language and rule out .ini and XML, I'm not sure what's left. JSON, perhaps.

Pickle, of course.

Stephen Thorne

5:55 a.m.

On 2011-04-07, Glyph Lefkowitz wrote:

...

On Apr 7, 2011, at 1:08 AM, Tim Allen wrote:

...
Well, the nice thing about ConfigParser is that it's in the stdlib, and people already know how to create them, and rolling yet-another-config-file-format seems crazy in this day and age.

My point was really that people think they know how to create these, but actually they don't. Pop quiz, hot shot: what is the quoting rule to put a linebreak with preserved trailing whitespace into a value in a .ini file? Into a key?

Reality check. These are the plugins that are currently shipped:

    ftp            An FTP server.
    telnet         A simple, telnet-based remote debugging service.
    socks          A SOCKSv4 proxy service.
    manhole-old    An interactive remote debugger service.
    portforward    A simple port-forwarder.
    web            A general-purpose web server which can serve from a filesystem or application resource.
    inetd          An inetd(8) replacement.
    news           A news server.
    xmpp-router    An XMPP Router server
    words          A modern words server
    dns            A domain name server.
    mail           An email service
    manhole        An interactive remote debugger service accessible via telnet and ssh and providing syntax coloring and basic line editing functionality.
    conch          A Conch SSH service.
    procmon        A process watchdog / supervisor

Why do we care about complex quoting and linebreaks for descriptions? If you can't remember, just keep typing and let it get wrapped.

--
Regards,
Stephen Thorne
Development Engineer
Netbox Blue

Glyph Lefkowitz

6:04 a.m.

On Apr 7, 2011, at 1:55 AM, Stephen Thorne wrote:

...

On 2011-04-07, Glyph Lefkowitz wrote:

...
On Apr 7, 2011, at 1:08 AM, Tim Allen wrote:

...
Well, the nice thing about ConfigParser is that it's in the stdlib, and people already know how to create them, and rolling yet-another-config-file-format seems crazy in this day and age.

My point was really that people think they know how to create these, but actually they don't. Pop quiz, hot shot: what is the quoting rule to put a linebreak with preserved trailing whitespace into a value in a .ini file? Into a key?

Reality check. These are the plugins that are currently shipped:

    ftp            An FTP server.
    telnet         A simple, telnet-based remote debugging service.
    socks          A SOCKSv4 proxy service.
    manhole-old    An interactive remote debugger service.
    portforward    A simple port-forwarder.
    web            A general-purpose web server which can serve from a filesystem or application resource.
    inetd          An inetd(8) replacement.
    news           A news server.
    xmpp-router    An XMPP Router server
    words          A modern words server
    dns            A domain name server.
    mail           An email service
    manhole        An interactive remote debugger service accessible via telnet and ssh and providing syntax coloring and basic line editing functionality.
    conch          A Conch SSH service.
    procmon        A process watchdog / supervisor

Why do we care about complex quoting and linebreaks for descriptions? If you can't remember, just keep typing and let it get wrapped.

None of those descriptions have non-ASCII characters in there either, but that doesn't mean I want to standardize on a format where I can't figure out how to type them. I would like to provide more flexibility with a simpler API, not less flexibility and more complexity.

David

6:24 a.m.

On 04/07/2011 02:08 PM, Tim Allen wrote:

...

If you need a non-Turing-complete config language and rule out .ini and XML, I'm not sure what's left. JSON, perhaps.

Having had experience with JSON for configuration: it is a terrible format for configuration, if only because it does not support comments. The syntax is also a bit too strict: enough to be annoying in something you want to edit all the time and easily in my experience. cheers, David
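A quick check of both complaints, using the standard library's `json` module. The sample documents and the `parses` helper are made up for the demonstration.

```python
import json

# Two documents a person editing a config file by hand might plausibly write.
with_comment = '{\n  // listen on the default port\n  "port": 8080\n}'
trailing_comma = '{"port": 8080,}'
strict = '{"port": 8080}'


def parses(text):
    """Return True if the stdlib JSON parser accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


for sample in (with_comment, trailing_comma, strict):
    print(parses(sample))
```

Only the last, strictly formatted document is accepted; the comment and the trailing comma are both rejected.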

Tim Allen

6:34 a.m.

On Thu, Apr 07, 2011 at 03:24:57PM +0900, David wrote:

...

On 04/07/2011 02:08 PM, Tim Allen wrote:

...
If you need a non-Turing-complete config language and rule out .ini and XML, I'm not sure what's left. JSON, perhaps.

Having had experience with JSON for configuration: it is a terrible format for configuration, if only because it does not support comments.

The syntax is also a bit too strict: enough to be annoying in something you want to edit all the time and easily in my experience.

Well, that's pretty depressing. The only other candidate I can even think of is YAML, and that's not in the standard library (as far as I know). Who'd have guessed it'd be so complicated to associate keys with values?

David

6:38 a.m.

On 04/07/2011 03:34 PM, Tim Allen wrote:

...

Who'd have guessed it'd be so complicated to associate keys with values?

If that's the only thing you need, .ini would work fine. Another solution would be python files with only literals, parsed through the ast module for safety. cheers, David
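A minimal sketch of the "Python files with only literals" idea, using `ast.literal_eval` from the standard library. The metadata keys and the `load_literal_metadata` helper name are illustrative assumptions, not a proposed format.

```python
import ast

# Contents of a hypothetical metadata file: a single dict literal.
source = """
{
    # Comments are allowed here, unlike in JSON.
    'tapname': 'example',
    'description': 'Example Twistd Plugin',
    'module': 'examplepackage.examplemodule',
}
"""


def load_literal_metadata(text):
    """Parse text as a Python literal; anything that is not a literal is rejected."""
    return ast.literal_eval(text.strip())


metadata = load_literal_metadata(source)
print(metadata['tapname'])

# A file that tries to run code is refused rather than executed.
try:
    load_literal_metadata("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

The file keeps Python's familiar quoting rules and comments, but nothing in it is ever executed.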

Jason Rennie

12:02 p.m.

On Thu, Apr 7, 2011 at 2:34 AM, Tim Allen wrote:

...

Well, that's pretty depressing. The only other candidate I can even think of is YAML, and that's not in the standard library (as far as I know).

There's Coil, but it's also not in the std lib AFAIK: http://mike.marineau.org/coil/ Jason -- Jason Rennie Research Scientist, ITA Software 617-714-2645 http://www.itasoftware.com/

Johan Rydberg

9:35 a.m.

On 4/7/11 8:24 AM, David wrote:

...

Having had experience with JSON for configuration: it is a terrible format for configuration, if only because it does not support comments.

The syntax is also a bit too strict: enough to be annoying in something you want to edit all the time and easily in my experience.

I agree. We use json as config-file format from time to time, but it always ends up hurting you. I therefore hacked up this little library: https://github.com/edgeware/structprop

Stephen Thorne

5:38 a.m.

On 2011-04-07, Glyph Lefkowitz wrote:

...

Because, frankly, Python installation tools REALLY REALLY SHOULD be able to install Python files into Python packages. I'm not sure I can make any other assertions quite so strongly. I'm pretty sure that this is a problem that more than one project is interested in solving. No other projects are interested in installing '.twistd' files though, I can assure you of that :).

This is entirely wrong. Python installation tools are barely capable of putting entire existing working python packages into a directory that if you mumble rhymes with "kite smackages".

To expect installation tools to be able to put a python-script into a nested subdirectory of an entirely different subtree without putting in __init__.py files, and working with .pth files, and not being insane, you have to hack things.

kite-smackages/twisted/plugins/myplugin.py installed with any standard tool without simply hard-coding is a disaster. Putting it outside of that directory, more so.

--
Regards,
Stephen Thorne
Development Engineer
Netbox Blue

Glyph Lefkowitz

5:45 a.m.

On Apr 7, 2011, at 1:38 AM, Stephen Thorne wrote:

...

On 2011-04-07, Glyph Lefkowitz wrote:

...
Because, frankly, Python installation tools REALLY REALLY SHOULD be able to install Python files into Python packages. I'm not sure I can make any other assertions quite so strongly. I'm pretty sure that this is a problem that more than one project is interested in solving. No other projects are interested in installing '.twistd' files though, I can assure you of that :).

This is entirely wrong. Python installation tools are barely capable of putting entire existing working python packages into a directory that if you mumble rhymes with "kite smackages".

This is why I said "should". Python installation tools are basically incapable of anything. And yet. My point is that it's hard enough to advocate for bugs to be fixed in installing .py files; let's focus on that, and avoid installing other stuff.

Stephen Thorne

5:54 a.m.

On 2011-04-07, Glyph Lefkowitz wrote:

...

On Apr 7, 2011, at 1:38 AM, Stephen Thorne wrote:

...
On 2011-04-07, Glyph Lefkowitz wrote:

...
Because, frankly, Python installation tools REALLY REALLY SHOULD be able to install Python files into Python packages. I'm not sure I can make any other assertions quite so strongly. I'm pretty sure that this is a problem that more than one project is interested in solving. No other projects are interested in installing '.twistd' files though, I can assure you of that :).

This is entirely wrong. Python installation tools are barely capable of putting entire existing working python packages into a directory that if you mumble rhymes with "kite smackages".

This is why I said "should". Python installation tools are basically incapable of anything. And yet.

My point is that it's hard enough to advocate for bugs to be fixed in installing .py files; let's focus on that, and avoid installing other stuff.

Everything has the capability to put datafiles somewhere, even if the location offset is annoying. In fact, most tools are /better/ at installing arbitrary files that don't have a first line of #!python or .py extension than files that do.

For this reason, a static configuration file, such as a hand crafted pickle, an xml file, or an ini file, is a reasonably viable alternative.

I hesitate to suggest it, but a .pth file or a setuptools entrypoint may be an option here too for inserting plugins without writing to a doubly nested non-package twisted ''package'' plugins directory.

--
Regards,
Stephen Thorne
Development Engineer
Netbox Blue

Glyph Lefkowitz

6:08 a.m.

On Apr 6, 2011, at 8:35 PM, Stephen Thorne wrote:

...

Part of the discussion was about how to rewrite this in such a way that no python code needs to be run in order to discover all the tapname+description combinations that are available to twistd, this is because of a perceived performance and sanity deficit in using 'twistd'.

My interest in this discussion is not so much in "no python code should be executed" but rather "the current constraints of the system should be preserved (your whole package doesn't get imported) but you shouldn't have to write hacks like ServiceMaker (http://twistedmatrix.com/documents/11.0.0/api/twisted.application.service.Se...) to preserve them". Or, for that matter, do inner imports, like this one from your example:

...

    def makeService(self, options):
        from examplepackage.examplemodule import make_service
        return make_service(debug=options['debug'])

Someone unfamiliar with the Twisted plugin system would probably not realize that the positioning of that import is critically important. It seems kind of random, and maybe sloppy, and a refactoring for stylistic fixes might move it to the top of the module. Of course, such a refactoring would make 'twistd --help' on any system with your code installed start executing gobs and gobs of additional code. Also, as a result of such a change, every 'twistd' server on such a system would have your entire examplepackage.examplemodule imported, silently of course, increasing their memory footprint and so on. As I have mentioned in other parts of this mailing list thread, there's already some caching going on, but it's never used. Observe: glyph@... twisted/plugins$ python Python 2.6.1 (...)

>>> from cPickle import load
>>> plugins = load(file('dropin.cache'))
>>> plugins['twisted_names'].plugins
[]
>>> plugins['twisted_names'].plugins[0].name
'TwistedNames'
>>> plugins['twisted_names'].plugins[0].description
'\n Utility class to simplify the definition of L{IServiceMaker} plugins.\n '
>>> plugins['twisted_names'].plugins[0].provided
[<InterfaceClass twisted.plugin.IPlugin>, <InterfaceClass twisted.application.service.IServiceMaker>]
>>> import sys
>>> 'twisted.plugins' in sys.modules
False

The problem with this is that once you've loaded the plugins, you can't see it any more:

>>> from twisted.plugin import getPlugins
>>> from twisted.application.service import IServiceMaker
>>> allPlugins = list(getPlugins(IServiceMaker))
>>> plugin = [p for p in allPlugins if p.tapname == 'dns'][0]
>>> plugin.description
'A domain name server.'
>>> plugin.name
'Twisted DNS Server'

Those are the 'name' and 'description' attributes from the IServiceMaker provider, already implicitly loaded by getPlugins. You can't see the CachedPlugin any more.

So, here's an idea, very similar to the one on the ticket. Keeping in mind the state described above, hopefully it will communicate my idea better.

Right now, IPlugin is purely a marker. It provides no methods. I propose a new subinterface (designed to eventually replace it), IPlugin2, with a method, 'metadata()', that returns a dictionary mapping strings to strings. This _could_ be any object, limited only by what we think is a good idea to allow serializing. A second method, 'willProvide(I)', returns a boolean: whether the result of load() will provide the interface 'I'.

Then there's a helper which you inherit which looks like:

    class Plugin2(object):
        implements(IPlugin2)

        def metadata(self):
            raise NotImplementedError("your metadata here")

        def willProvide(self, I):
            return I.providedBy(self)

        def load(self):
            return self

The one rule here is that 'metadata()' must always return the same value for a particular version of the code.

We will then serialize the metadata from calling metadata() into dropin.cache, and expose it to application code. My idea for exposing it is that if you then do 'getPlugins(IPlugin2)', you will get back an iterable of IPlugin2 providers, but not necessarily instances of your classes: they could be cached plugins, with cached results for metadata() and willProvide() - the latter based on the list currently saved as the 'provided' attribute. So a loop like this to load a twistd plugin by name:

    def twistdPluginByTapname(name):
        for p2 in getPlugins(IPlugin2):
            if p2.willProvide(IServiceMaker) and p2.metadata()['tapname'] == name:
                return p2.load()

... would not actually load any plugins, but work entirely from the cached metadata.
Since you wouldn't be loading the plugin except to actually invoke its dynamic behavior, we would no longer need ServiceMaker, just an instance of the actual IServiceMaker plugin, with no local imports or anything.

This would at least partially address one of your complaints, Stephen, in that it would mean that a plugin could be defined with 2 lines: import your class, and create an instance of it. Of course you'd still need boilerplate somewhere, but it would be possible to put a big pile of them in one place, or define some common stuff in a utility module, and not need to dance around avoiding importing it.

As a separate consideration, once this API is in place, it isn't all that important that we generate that initial metadata by importing the Python code the way that we do now. The metadata could be manually specified. I think that would be a good first step, but we could, for example, put the metadata in some human-readable format rather than pickle. JSON, I guess, is what's hip with the kids these days. Or, if you philistines really won't quit, an .ini file. But don't tell me I didn't warn you ;-). The actual list of plugins could be generated from these data files as well.

But, if we were to put this kind of extra metadata into a data file right now, the current API wouldn't give you any way to access it.
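A minimal sketch of the cached-metadata lookup described above, written in plain Python so that it runs without Twisted or zope.interface. The names CachedPlugin2 and plugin_by_tapname, the use of strings in place of interface objects, and the in-memory CACHE list are all assumptions made for illustration. None of them is part of Twisted's API, and a real implementation would presumably compare zope.interface objects and read its entries from dropin.cache.

```python
class CachedPlugin2(object):
    """Hypothetical cache entry: answers metadata() and willProvide()
    from stored data and only runs the loader when load() is called."""

    def __init__(self, metadata, provided, loader):
        self._metadata = dict(metadata)        # string -> string, as proposed
        self._provided = frozenset(provided)   # interface names (assumption)
        self._loader = loader                  # stands in for the real import
        self.load_count = 0                    # for demonstration only

    def metadata(self):
        return dict(self._metadata)

    def willProvide(self, interface_name):
        return interface_name in self._provided

    def load(self):
        self.load_count += 1
        return self._loader()


def plugin_by_tapname(cached_plugins, name):
    """Return the loaded plugin whose cached tapname matches, else None."""
    for p2 in cached_plugins:
        if (p2.willProvide("IServiceMaker")
                and p2.metadata().get("tapname") == name):
            return p2.load()
    return None


# Hypothetical cache contents; each loader stands in for an expensive import.
dns = CachedPlugin2({"tapname": "dns", "description": "A domain name server."},
                    ["IPlugin2", "IServiceMaker"], lambda: "dns service maker")
web = CachedPlugin2({"tapname": "web"}, ["IPlugin2", "IServiceMaker"],
                    lambda: "web service maker")
reporter = CachedPlugin2({"name": "verbose"}, ["IPlugin2", "IReporter"],
                         lambda: "trial reporter")
CACHE = [web, reporter, dns]

found = plugin_by_tapname(CACHE, "dns")
```

In this sketch only the matching entry's loader runs. The other two entries are filtered out using cached data alone, which is the property the proposal is after.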

Itamar Turner-Trauring

12:14 p.m.

On Thu, 2011-04-07 at 02:08 -0400, Glyph Lefkowitz wrote:

...

My idea for exposing it is that if you then do 'getPlugins(IPlugin2)', you will get back an iterable of IPlugin2 providers, but not necessarily instances of your classes: they could be cached plugins, with cached results for metadata() and willProvide() - the latter based on the list currently saved as the 'provided' attribute. So a loop like this to load a twistd plugin by name:

    def twistdPluginByTapname(name):
        for p2 in getPlugins(IPlugin2):
            if p2.willProvide(IServiceMaker) and p2.metadata()['tapname'] == name:
                return p2.load()

... would not actually load any plugins, but work entirely from the cached metadata.

That's where the whole idea falls down for me. Evidence suggests (and you note this earlier) that caching doesn't work anywhere in the real world. My current Ubuntu install complains about a read-only cache every time I run lore (and I'm pretty sure there's nothing added to my PYTHONPATH other than installed system packages). Any design which assumes caching works appears to be useless in the real world. So, the design has to *not* rely on caching working.

Andrew Bennetts

1:19 p.m.

Itamar Turner-Trauring wrote: […]

...

So, the design has to *not* rely on caching working.

FWIW: this is an achievable goal. I have 32 different bzr plugins currently installed, and here's the difference they make:

$ time bzr --no-plugins rocks
It sure does!

real 0m0.075s

$ time bzr rocks
It sure does!

real 0m0.119s

So that's about 1.5ms per plugin, on average. With a hot disk cache, at least…

For comparison, 'twistd --version' takes 116ms, with a dropin.cache and (I think, although how can I tell?) no plugins installed.

In part, we achieve this via the bzrlib.lazy_import hack, which plugins can and often do use, and by encouraging plugin authors to put as little code into their __init__.py files as possible. A typical plugin's __init__ might do just:

    # This is example_plugin/__init__.py
    # The actual command implementation is in
    # example_plugin/example_commands.py
    from bzrlib import commands
    commands.plugin_cmds.register_lazy('cmd_class_name', [],
        'bzrlib.plugins.example_plugin.example_commands')

Glyph's expressed scepticism that plugin authors and maintainers will know to keep their __init__.py files cheap to import. Bazaar's experience is different. Partly that's probably because the Bazaar community has paid a fair bit of attention to start up time and I suppose Twisted doesn't have that. But I think also it's partly because we've provided tools to help people diagnose what/who to blame for bzr being slow to start, like 'bzr --profile-imports', and even the crude 'time bzr rocks'.

-Andrew.
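The lazy-registration idea above can be sketched generically as follows. This is not bzrlib's implementation: LazyRegistry and its methods are hypothetical names, and standard-library modules are used as import targets only so that the example is runnable. The point it illustrates is that registering and listing commands costs two stored strings each, while the import is paid only when a command is actually requested.

```python
import importlib


class LazyRegistry(object):
    """Hypothetical registry: maps a name to (module name, attribute name)
    and imports the module only when the name is first requested."""

    def __init__(self):
        self._lazy = {}       # name -> (module_name, attr_name)
        self._resolved = {}   # name -> imported object

    def register_lazy(self, name, module_name, attr_name):
        # Nothing is imported here; only two strings are stored.
        self._lazy[name] = (module_name, attr_name)

    def names(self):
        # Listing the registered names needs no imports at all.
        return sorted(self._lazy)

    def is_loaded(self, name):
        return name in self._resolved

    def get(self, name):
        if name not in self._resolved:
            module_name, attr_name = self._lazy[name]
            module = importlib.import_module(module_name)
            self._resolved[name] = getattr(module, attr_name)
        return self._resolved[name]


registry = LazyRegistry()
# Standard-library targets, chosen only to keep the sketch self-contained.
registry.register_lazy("dumps", "json", "dumps")
registry.register_lazy("wrap", "textwrap", "wrap")

listing = registry.names()                  # no import triggered by this
loaded_before = registry.is_loaded("dumps")
encoded = registry.get("dumps")([1, 2])     # the registry resolves json here
```

A cheap __init__.py in this style does nothing but call register_lazy, so a help listing can be produced from names() without touching the implementation modules.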

Glyph Lefkowitz

7:27 p.m.

On Apr 7, 2011, at 9:19 AM, Andrew Bennetts wrote:

...

Itamar Turner-Trauring wrote: […]

...
So, the design has to *not* rely on caching working.

FWIW: this is an achievable goal. I have 32 different bzr plugins currently installed, and here's the difference they make:

$ time bzr --no-plugins rocks It sure does!

real 0m0.075s

$ time bzr rocks It sure does!

real 0m0.119s

So that's about 1.5ms per plugin, on average. With a hot disk cache, at least…

Is your cache as hot for Twisted as for bzr? Have you replicated these results in a randomized, double-blind clinical trial? ;-) I'm not surprised that bzr has faster startup though; twistd has not been (and doubtful will ever be) nearly so ruthlessly optimized. Maybe it's time to put a startup benchmark on http://speed.twistedmatrix.com/, at least that way we could keep track.

...

For comparison, 'twistd --version' takes 116ms, with a dropin.cache and (I think, although how can I tell?) no plugins installed.

Twisted itself installs 22 dropins (python files which each define at least one plugin), which comprise 48 plugins of various types, so there are always some. You should be able to tell, though. It's pathetic that we don't have a command-line tool to inspect the available plugins and what they're doing. Independent of the other issues under discussion here: http://twistedmatrix.com/trac/ticket/5039. But this is all moot. 'twistd --version' doesn't scan for plugins, so that's all just the normal startup time; apparently we import too much in the first place. The thing to compare with is 'twistd --help' or even just 'twistd [some-plugin]' (since invoking one plugin actually loads all of them). Plus - this is really the genesis for this thread - the dropin.cache isn't really saving us much work at all right now, because all the plugins get loaded anyway for all practical uses of plugin scanning.

...

In part, we achieve this via the bzrlib.lazy_import hack, which plugins can and often do use, and by encouraging plugin authors to put as little code into their __init__.py files as possible. A typical plugin's __init__ might do just:

    # This is example_plugin/__init__.py
    # The actual command implementation is in
    # example_plugin/example_commands.py
    from bzrlib import commands
    commands.plugin_cmds.register_lazy('cmd_class_name', [],
        'bzrlib.plugins.example_plugin.example_commands')

This looks very similar to ServiceMaker.

...

Glyph's expressed scepticism that plugin authors and maintainers will know to keep their __init__.py files cheap to import. Bazaar's experience is different. Partly that's probably because the Bazaar community has paid a fair bit of attention to start up time and I suppose Twisted doesn't have that.

Yeah, bzr's audience makes this easier. For one thing, the audience is much bigger :), but more importantly, bzr is a user-facing tool which users are running _constantly_ at the command line. The only visible consequence of a rogue twistd plugin is that your server which runs for days at a time takes 0.2s longer to start; the real problem sets in later, where your 25 subprocesses are suddenly consuming an additional 50meg each because of the extra plugin they loaded. You do find this eventually, it's just rare to find it while you're writing the plugin.

...

But I think also it's partly because we've provided tools to help people diagnose what/who to blame for bzr being slow to start, like 'bzr --profile-imports', and even the crude 'time bzr rocks'.

Yes. These are a great idea, and there's no excuse that Twisted's plugin system is so difficult to inspect and debug. A couple of good tools would address a wide range of plugin issues, many of them much more interesting than performance, like the ever-popular "why isn't my plugin getting loaded". Thanks for the impetus to file the ticket above. (I kinda hope it's a dup, but I couldn't find one.)

Glyph Lefkowitz

6:20 p.m.

On Apr 7, 2011, at 8:14 AM, Itamar Turner-Trauring wrote:

...

On Thu, 2011-04-07 at 02:08 -0400, Glyph Lefkowitz wrote:

...
My idea for exposing it is that if you then do 'getPlugins(IPlugin2)', you will get back an iterable of IPlugin2 providers, but not necessarily instances of your classes: they could be cached plugins, with cached results for metadata() and willProvide() - the latter based on the list currently saved as the 'provided' attribute. So a loop like this to load a twistd plugin by name:

    def twistdPluginByTapname(name):
        for p2 in getPlugins(IPlugin2):
            if p2.willProvide(IServiceMaker) and p2.metadata()['tapname'] == name:
                return p2.load()

... would not actually load any plugins, but work entirely from the cached metadata.

That's where the whole idea falls down for me. Evidence suggests (and you note this earlier) that caching doesn't work anywhere in the real world. My current Ubuntu install complains about a read-only cache every time I run lore (and I'm pretty sure there's nothing added to my PYTHONPATH other than installed system packages). Any design which assumes caching works appears to be useless in the real world.

So, the design has to *not* rely on caching working.

Here's an idea: let's make caching actually work :). Prior experience indicates that with some small amount of dedication, it's possible to make a module in Twisted not be broken all the time.

As you observed that I already mentioned earlier in the thread, caching never works because post-installation hooks are such a pain, and you have to have special permissions to access the cache file. So, separately from this, we could attempt a secondary cache read/write to a location much more likely to be writable by the user (something like ~/.local/var/cache/usr_lib_python2.6_site-packages.dropin.cache), read if the first one is out of date and written if writing the first one fails.

Also: we already rely on this behavior, so things are just as broken now for you. For example, you'll end up loading the code for all twistd plugins and trial reporters when what you want are lore plugins. This could also be fixed independently. (To fix your particular installation right now, 'sudo twistd --help' or 'sudo lore' once.)

And, finally, as a separate consideration, we could make "cached metadata" mean "explicitly specified metadata" instead. The important thing that I'm talking about doing first is making the system work exactly the same way that it does now, with one additional feature in the API which would allow us to make use of metadata that lives outside the Python code, using the existing mechanism for storing metadata that is currently not used. For a first cut, we wouldn't even remove the ServiceMaker hack, just add the new feature to it so that we could do slightly less importing at startup.
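The secondary-cache idea could look roughly like the sketch below, under stated assumptions: the function names read_cache and write_cache, the freshness test based on file modification times, and the paths used in the demonstration are all hypothetical and do not describe how twisted.plugin currently handles dropin.cache. The sketch only shows the fallback order: read the first candidate that is fresh enough, and write to the first candidate that accepts the file.

```python
import os
import pickle
import tempfile


def read_cache(paths, source_mtime):
    """Return data from the first cache file that exists and is at least
    as new as source_mtime; return None if every candidate is unusable."""
    for path in paths:
        try:
            if os.path.getmtime(path) >= source_mtime:
                with open(path, "rb") as f:
                    return pickle.load(f)
        except (OSError, IOError, EOFError, pickle.UnpicklingError):
            continue  # missing, unreadable or corrupt: try the next one
    return None


def write_cache(paths, data):
    """Write to the first location that accepts the file; return that
    path, or None if no location is writable."""
    for path in paths:
        try:
            with open(path, "wb") as f:
                pickle.dump(data, f)
            return path
        except (OSError, IOError):
            continue
    return None


# Demonstration: a missing directory simulates a system location that
# cannot be written, and a temporary file stands in for a per-user cache.
base = tempfile.mkdtemp()
system_cache = os.path.join(base, "no-such-dir", "dropin.cache")
user_cache = os.path.join(base, "user.dropin.cache")
candidates = [system_cache, user_cache]

written_to = write_cache(candidates, {"dns": "A domain name server."})
fresh = read_cache(candidates, source_mtime=0)
stale = read_cache(candidates,
                   source_mtime=os.path.getmtime(user_cache) + 60)
```

When every candidate is stale or unwritable the caller gets None back and would fall through to scanning the plugin modules, so the cache stays an optimization rather than a requirement.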

Marcin Kasperski

19 Apr

2:27 p.m.

...

Part of the discussion was about how to rewrite this in such a way that no python code needs to be run in order to discover all the tapname+description combinations that are available to twistd, this is because of a perceived performance and sanity deficit in using 'twistd'.

Have you considered using setuptools entry_points? They are de facto standard and work fairly well for tools like paster or sqlalchemy...

exarkun@twistedmatrix.com

4:13 p.m.

On 02:27 pm, marcin.kasperski@mekk.waw.pl wrote:

...

...
Part of the discussion was about how to rewrite this in such a way that no python code needs to be run in order to discover all the tapname+description combinations that are available to twistd, this is because of a perceived performance and sanity deficit in using 'twistd'.

Have you considered using setuptools entry_points? They are de facto standard and work fairly well for tools like paster or sqlalchemy...

I don't think setuptools entry_points are expressive enough to be used here. However, regardless, due to problems with setuptools, I don't think Twisted should gain a non-optional dependency on it (as it would be for something as core as twistd plugins). If distribute makes it into the standard library (circa Python 3.3) then it might be reasonable to consider depending on it, if it actually manages to fix the issues it initially inherited from setuptools. Jean-Paul

Glyph Lefkowitz

6:06 p.m.

On Apr 19, 2011, at 12:13 PM, exarkun@twistedmatrix.com wrote:

...

On 02:27 pm, marcin.kasperski@mekk.waw.pl wrote:

...
...
Part of the discussion was about how to rewrite this in such a way that no python code needs to be run in order to discover all the tapname+description combinations that are available to twistd, this is because of a perceived performance and sanity deficit in using 'twistd'.

Have you considered using setuptools entry_points? They are de facto standard and work fairly well for tools like paster or sqlalchemy...

I don't think setuptools entry_points are expressive enough to be used here. However, regardless, due to problems with setuptools, I don't think Twisted should gain a non-optional dependency on it (as it would be for something as core as twistd plugins).

Strongly agreed on both counts. For a long time I wished that we could be more 'standard' in this regard, but the more I learned about how entrypoints actually work, the less I like them.

...

If distribute makes it into the standard library (circa Python 3.3) then it might be reasonable to consider depending on it, if it actually manages to fix the issues it initially inherited from setuptools.

I don't believe 'distribute' is ever making it into the standard library. The thing going into python 3.3 is 'packaging', which, obviously, is a copy (hopefully unmodified) of 'distutils2', which has nothing in common with 'distribute' except for its author. 'distribute' is a fork of setuptools that is actively maintained. 'distutils2' is a replacement for distutils (as I understand it, a rewrite) that does a bunch of things differently. More info here: https://bitbucket.org/tarek/distutils2/wiki/Home.

Kevin Horn

21 Apr

1:48 a.m.

...

I don't believe 'distribute' is ever making it into the standard library. The thing going into python 3.3 is 'packaging', which, obviously, is a copy (hopefully unmodified) of 'distutils2', which has nothing in common with 'distribute' except for its author.

'distribute' is a fork of setuptools that is actively maintained. 'distutils2' is a replacement for distutils (as I understand it, a rewrite) that does a bunch of things differently.

More info here: https://bitbucket.org/tarek/distutils2/wiki/Home.

This is essentially correct. A little history for the interested (feel free to skip it):

First there was setuptools. Everyone used it, but the author wasn't so great at keeping it maintained, since it was originally just something he wrote for himself. But since he used it a lot in his business, he didn't want lots of other people making changes either, so it fell into disrepair.

Tarek came along and got busy. He (along with some other people) forked setuptools and made distribute. He fixed some bugs, and planned to eventually clean things up and change the underlying API strangeness. But the further he got into it, the more he ran into problems with all the crazy extensions setuptools/distribute made to distutils. So he thought "Aha! I can just clean up distutils and things will be so much easier!". Well the joke was on him, cuz he discovered distutils was a complete cluster****.

So he rewrote distutils with an eye on keeping things nice for everyone. Project managers, distro packagers, users installing software, etc. This is distutils2. In Python 3.3 and up it will be called "packaging". Once people start using it, it will make a lot of the current packaging headaches in the Python world go away. But the one thing it does NOT do, that setuptools/distribute DID, is the entry_points stuff, which according to Tarek should be in a separate package anyway.

So long story short (too late), once distutils2/packaging drops, the idea will be for everyone to eventually move to using it rather than distutils, and using a setup.cfg rather than a setup.py. Also setuptools/distribute will hopefully go back to being a niche tool. Also also the Python community could use a plugin standard that could replace setuptools' entry_points.

The end.

Kevin Horn

anatoly techtonik

9:21 p.m.

On Thu, Apr 21, 2011 at 4:48 AM, Kevin Horn wrote:

...

cluster****. So he rewrote distutils with an eye on keeping things nice for everyone. Project managers, distro packagers, users installing software, etc. This is distutils2. In Python 3.3 and up it will be called "packaging". Once people start using it, it will make a lot of the current packaging headaches in the Python world go away.

Are you sure about that? Where is the list of stories it will solve when people start using it, so I can check that my cases are covered by distutils2 insurance plan?

...

will hopefully go back to being a niche tool. Also also the Python community could use a plugin standard that could replace setuptools' entry_points.

ABC classes? Trac components? What is a plugin anyway? Discoverable module on a syspath? Registered entity in package repository? Filename in %HOMEDIR% that starts with tx.? (autoloaders from PHP5?)

--
anatoly t.

David

22 Apr

1:01 a.m.

On 04/22/2011 06:21 AM, anatoly techtonik wrote:

...

On Thu, Apr 21, 2011 at 4:48 AM, Kevin Horn wrote:

...
cluster****. So he rewrote distutils with an eye on keeping things nice for everyone. Project managers, distro packagers, users installing software, etc. This is distutils2. In Python 3.3 and up it will be called "packaging". Once people start using it, it will make a lot of the current packaging headaches in the Python world go away.

Are you sure about that? Where is the list of stories it will solve when people start using it, so I can check that my cases are covered by distutils2 insurance plan?

[OT] You can take a look at bento, which is my own response to the distutils issues we have in the scipy community (but I would expect twisted and most big python libraries to have similar issues): http://cournape.github.com/Bento/ It is designed from the ground up with the idea of reliable customization and complex build supports. It can already build numpy and scipy with a near 50 % reduction in LOC compared to our setup.py, and more reliably thanks to using a real build tool in the backend (waf, but you can add support for a different one if you want). cheers, David

Mikhail Terekhov

3:13 a.m.

On Thu, Apr 21, 2011 at 9:01 PM, David wrote:

...

On 04/22/2011 06:21 AM, anatoly techtonik wrote:

...
On Thu, Apr 21, 2011 at 4:48 AM, Kevin Horn wrote:

...
cluster****. So he rewrote distutils with an eye on keeping things nice for everyone. Project managers, distro packagers, users installing software, etc. This is distutils2. In Python 3.3 and up it will be called "packaging". Once people start using it, it will make a lot of the current packaging headaches in the Python world go away.

Are you sure about that? Where is the list of stories it will solve when people start using it, so I can check that my cases are covered by distutils2 insurance plan?

[OT] You can take a look at bento, which is my own response to the distutils issues we have in the scipy community (but I would expect twisted and most big python libraries to have similar issues):

http://cournape.github.com/Bento/

It is designed from the ground up with the idea of reliable customization and complex build supports. It can already build numpy and scipy with a near 50 % reduction in LOC compared to our setup.py, and more reliably thanks to using a real build tool in the backend (waf, but you can add support for a different one if you want).

That is nice indeed. But why invent yet another scripting language for info files? Is good old Python not good enough? BTW are bento and waf sources included in that 50% reduction?

-- Mikhail Terekhov

David

7:44 a.m.

On 04/22/2011 12:13 PM, Mikhail Terekhov wrote:

...

On Thu, Apr 21, 2011 at 9:01 PM, David <david@silveregg.co.jp> wrote:

On 04/22/2011 06:21 AM, anatoly techtonik wrote:
> On Thu, Apr 21, 2011 at 4:48 AM, Kevin Horn <kevin.horn@gmail.com> wrote:
>> cluster****. So he rewrote distutils with an eye on keeping things nice for
>> everyone. Project managers, distro packagers, users installing software,
>> etc. This is distutils2. In Python 3.3 and up it will be called
>> "packaging". Once people start using it, it will make a lot of the current
>> packaging headaches in the Python world go away.
>
> Are you sure about that? Where is the list of stories it will solve
> when people start using it, so I can check that my cases are covered
> by distutils2 insurance plan?

[OT] You can take a look at bento, which is my own response to the distutils issues we have in the scipy community (but I would expect twisted and most big python libraries to have similar issues):

http://cournape.github.com/Bento/

It is designed from the ground up with the idea of reliable customization and complex build supports. It can already build numpy and scipy with a near 50 % reduction in LOC compared to our setup.py, and more reliably thanks to using a real build tool in the backend (waf, but you can add support for a different one if you want).

That is nice indeed. But why invent yet another scripting language for info files? Is good old Python not good enough?

The point is to have a mini DSL which is near static, so that it can safely be analysed server-side. *IF* you need more power (like numpy), then there is the notion of hook files, which are straight python files (no restriction, except that they have to be under the control of bento). Note that languages which are arguably more powerful than python, like Haskell, use the same thing: Cabal, the "haskell distutils", uses the same format. Actually, I shamelessly copied their format for bento.info

...

BTW are bento and waf sources included in that 50% reduction?

Waf, no, but bento+bento script is smaller than numpy.distutils+setup.py. If you count distutils itself, then I would not be surprised that waf+bento+bento script < distutils+numpy.distutils+setup.py, but that would need checking. And bento/waf have clear boundaries (different project, different maintainers, different histories), whereas numpy.distutils/distutils definitely does not have that.

cheers,
David

Glyph Lefkowitz

6:03 a.m.

On Apr 21, 2011, at 9:01 PM, David wrote:

...

You can take a look at bento, which is my own response to the distutils issues we have in the scipy community (but I would expect twisted and most big python libraries to have similar issues):

http://cournape.github.com/Bento/

It is designed from the ground up with the idea of reliable customization and complex build supports. It can already build numpy and scipy with a near 50 % reduction in LOC compared to our setup.py, and more reliably thanks to using a real build tool in the backend (waf, but you can add support for a different one if you want).

This looks very interesting. You kind of bury the lead on that web page though, so let me excerpt it for those who saw the first line, thought "oh, it's yet another python packaging thing" and stopped reading in disgust:

    Even better, bento has a distutils compatibility layer so that you can write a simple setup.py *which works under pip or easy_install*.

(emphasis mine)

This suggests that Twisted could actually switch to Bento without creating a massive disruption for our users who want to install it with an existing automation tool - which, frankly, is the main use-case for distutils at this point.

Would you recommend that we do this? Would there be a benefit? I like the part where you said "near 50% reduction in LOC" quite a lot but I'm sure there are issues that would come along with it.

David

9:03 a.m.

On 04/22/2011 03:03 PM, Glyph Lefkowitz wrote:

...

On Apr 21, 2011, at 9:01 PM, David wrote:

...
You can take a look at bento, which is my own response to the distutils issues we have in the scipy community (but I would expect twisted and most big python libraries to have similar issues):

http://cournape.github.com/Bento/

It is designed from the ground up with the idea of reliable customization and complex build supports. It can already build numpy and scipy with a near 50 % reduction in LOC compared to our setup.py, and more reliably thanks to using a real build tool in the backend (waf, but you can add support for a different one if you want).

This looks /very/ interesting. You kind of bury the lead on that web page though, so let me excerpt it for those who saw the first line, thought "oh, it's yet another python packaging thing" and stopped reading in disgust:

Even better, bento has a distutils compatibility layer http://cournape.github.com/Bento/html/transition.html so that you can write a simple setup.py *which works under pip or easy_install*.

(emphasis mine)

This suggests that Twisted could actually /switch/ to Bento without creating a massive disruption for our users who want to install it with an existing automation tool - which, frankly, is the main use-case for distutils at this point.

Would you recommend that we do this? Would there be a benefit? I like the part where you said "near 50% reduction in LOC" quite a lot but I'm sure there are issues that would come along with it.

Actually, twisted is on my list of packages to convert to bento to get a feeling of what's missing in bento :) To get an actual idea of what it looks like at the moment, you can see here: https://github.com/cournape/numpy/tree/bento_waf_build (bento.info and bscript - especially numpy/core/bscript).

The reason why the distutils compatibility thing is not emphasized is because I cannot possibly support what makes bento interesting in my mind under this mode: out of tree builds, hooks support, recursive description, pluggable build backend, etc... After all, if I could support what I have in mind with distutils, I would have started from distutils and not from scratch (I already did in a former life, and bento is born out of that failure).

Now, concerning the use of pip/easy_install: if those tools' authors were willing to add hooks to support additional tools, it would not take much. Fundamentally, you only need to say "bentomaker install" instead of "python setup.py install", plus all the dirty details. This will be needed anyway with the distutils2 effort, since they have also started using a static format description and python setup.py install will not work anymore (IIRC).

As for what you would gain doing so:

- robust recursive support (things like relativeTo as used in twisted are inherently fragile once you don't want to assume source tree == current directory)
- automatic dependency handling, easy customization and parallel support for compiled code (waf automatically scans sources to find header dependencies - waf has its quirks, but it has been recently used by SAMBA, which is a pretty good endorsement in my mind as far as complex builds go)
- a simple and robust way to install data files (installing things in chroots will finally be possible, a pet peeve of mine when deploying twisted apps)
- it would help me evangelize bento :)

As you mentioned, most people say "sigh, another packaging python thing". I have not found a good angle to quickly describe what I am doing, because it is more about the how than the what. Maybe you could say that bento is trying the "pylons approach" of reusing existing tools, whereas distutils is more of the "django approach". For the scipy community (where I am coming from), the advantages of using a real build tool with dependency handling are obvious, but many people don't care about that.

Note that bento is currently a moving target, but I hope to be close to a first alpha in a couple of months (bento development started in December 2009). I am careful with timing because I don't want to reproduce the precedent of setuptools, which became popular but with issues that became too costly to fix afterwards.

cheers,
David

anatoly techtonik

6:41 a.m.

On Fri, Apr 22, 2011 at 4:01 AM, David wrote:

...

On 04/22/2011 06:21 AM, anatoly techtonik wrote:

...
On Thu, Apr 21, 2011 at 4:48 AM, Kevin Horn wrote:

...
cluster****. So he rewrote distutils with an eye on keeping things nice for everyone. Project managers, distro packagers, users installing software, etc. This is distutils2. In Python 3.3 and up it will be called "packaging". Once people start using it, it will make a lot of the current packaging headaches in the Python world go away.

Are you sure about that? Where is the list of stories it will solve when people start using it, so I can check that my cases are covered by distutils2 insurance plan?

[OT] You can take a look at bento, which is my own response to the distutils issues we have in the scipy community (but I would expect twisted and most big python libraries to have similar issues):

http://cournape.github.com/Bento/

It is designed from the ground up with the idea of reliable customization and complex build supports. It can already build numpy and scipy with a near 50 % reduction in LOC compared to our setup.py, and more reliably thanks to using a real build tool in the backend (waf, but you can add support for a different one if you want).

[OT] I still can't see how it solves even the basic user story - 'i want to uninstall twisted' or 'i want two versions of twisted installed'. Absolute paths in examples won't work on Windows, and the hardcoded version field in the .info file is inconvenient. It looks like yet another pip, distribute or easy_install. Don't get me wrong - it looks better - but for yet another NIH packaging solution there should be some convincing facts or use cases (examples) showing why this particular solution is better. In fact, I'd like to see a Wikipedia-like comparison of the different 'packaging solutions' somewhere at http://wiki.python.org/moin/Packaging because I don't use anything except, well, easy_install, which can't even install protocol buffers. I guess bento can? -- anatoly t.

David

7:51 a.m.

On 04/22/2011 03:41 PM, anatoly techtonik wrote:

[OT] I still can't see how it solves even the basic user story - 'i want to uninstall twisted' or 'i want two versions of twisted installed'.

Bento's point is: make packagers' lives easier (without making users' lives more miserable), so that you are more likely than before to be able to use the native tools. People who are happy installing from source will not be disrupted, and people like me who hate source installs and love Linux packaging (or Windows .msi for that matter) can actually build those packages without going insane trying to understand distutils.

...

Absolute paths in examples won't work on Windows

Of course they do - like in distutils, I translate them inside bento so that everything works on any platform. cheers, David

anatoly techtonik

3:02 p.m.

On Fri, Apr 22, 2011 at 10:51 AM, David wrote:

...
[OT] I still can't see how it solves even the basic user story - 'i want to uninstall twisted' or 'i want two versions of twisted installed'.

Bento's point is: make packagers' lives easier (without making users' lives more miserable), so that you are more likely than before to be able to use the native tools.

You are doing an awesome thing, but it will be a total waste of time if the Fellowship of the Packaging fails to make life better for Python users. I am sure their ranks will appreciate your experience in this area. ;) -- anatoly t.


participants (15)

anatoly techtonik
Andrew Bennetts
David
exarkun＠twistedmatrix.com
Glyph Lefkowitz
Itamar Turner-Trauring
Jason Rennie
Johan Rydberg
Kevin Horn
Marcin Kasperski
Mikhail Terekhov
Phil Christensen
Stephen Thorne
Tim Allen
Tim Allen
