[Twisted-Python] Twisted Euphrates (Modified by Glyph Lefkowitz)
Unfortunately, I've been out of the loop on Twisted development for some time. During the course of the PyCon sprints, I've been catching up with what's been going on and trying to do some planning for future directions in development. Yesterday Itamar and I had our quarterly screaming argument about obscure details, and it didn't come to blows, which I believe means that spring is right around the corner!

The major issue we've been discussing is splitting up Twisted into multiple packages.

Twisted has been encountering some growing pains, due to a mismatch between the way it's developed and the way it's distributed. We (T.M.Labs) are really a community of developers working on several interrelated projects with different goals, stability, and release cycles, but we are distributing one monolithic system, with code of vastly different quality in the same tarball and sometimes even in the same package.

The primary problem that this leads to is revision coupling. This is probably old news to anyone trying to deploy a large system which relies on any part of Twisted besides the base reactor (and maybe twisted.web). Scenarios like this happen all the time:

- Twisted v.N.M.0 is released.
- Project P uses twisted components X and Y. (Generally X and Y are marked 'Unstable' or 'Semi-Stable', but are very useful or heavily marketed or otherwise convince unsuspecting users to trust their stability.)
- Project Q uses twisted components Y and Z.
- Project P v. 1 is released.
- Project Q v. 1 is released.
- Project P gives useful feedback to twisted developer A, who then fixes several bugs in component X which are critical to the functioning of the as-yet-unreleased P v.2.
- Twisted developer B, working on project Q, commits substantial "improvements" to component Y, which change the functionality in a way that breaks project P.
- Radix prepares an alpha/bugfix release, v.N.M.1.

At this point, Project Q is perfectly happy, because the most recent alpha works for their code.
However, Project P is very unhappy, because there is no release which includes their bugfix but does NOT include code which forces them to upgrade some functionality in a way which is incompatible with the previous micro-version release of Twisted which their users presumably already have installed. The release notes have to be very specific, and because nobody reads them anyway, a lot of confusion results.

While there are problems with splitting Twisted into multiple projects, I think that the time has come to do so. Originally the feedback I heard from most users was that it was great to have lots of different packages available so readily; the "batteries included" philosophy does have a lot going for it. Now, most of what I'm hearing is complaints about the above scenario, not the least of which come from my own employer :-). We are more often in the role of project Q than project P (although we have been in both more than once) and I'd prefer not to have a few projects that employ active Twisted developers control the release cycle based on their requirements.

Because splitting the project involves moving some code around, we should take the opportunity to clean up another problem: our code has no effective separation between interface and implementation. Within the project this is fine, because we can easily refactor any uses of deprecated APIs. While we aren't going to break the existing package-structure interfaces (or even deprecate them, for the time being), this means that in the future, imports for twisted projects will probably look like:

    from twisted_core import reactor
    from twisted_web import Resource

The way this will be implemented, a top-level "twisted_core" module (or possibly a package, but I would like as little nesting as possible here) will import explicitly public names from a _twisted_core implementation package.
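As a rough illustration of the interface/implementation split described above, the sketch below builds a private implementation module and a thin public facade that re-exports only selected names. The names `_demo_core`, `demo_core`, `listen`, `_poll_once`, and `make_facade` are hypothetical stand-ins, not actual Twisted code, and the modules are constructed in memory only so that the example runs on its own; in a real layout the facade would more likely be an ordinary file containing explicit `from ... import ...` lines.

```python
import sys
import types

# Hypothetical private implementation module (stand-in for "_twisted_core").
_impl = types.ModuleType("_demo_core")


def listen(port):
    """Public entry point that lives in the private module."""
    return "listening on %d" % port


def _poll_once():
    """Internal helper that applications should not rely on."""
    return "polled"


_impl.listen = listen
_impl._poll_once = _poll_once
sys.modules["_demo_core"] = _impl


def make_facade(name, impl, public_names):
    """Build a public module exposing only the listed names from impl."""
    facade = types.ModuleType(name)
    for public_name in public_names:
        setattr(facade, public_name, getattr(impl, public_name))
    # __all__ documents the supported surface and limits "import *".
    facade.__all__ = list(public_names)
    sys.modules[name] = facade
    return facade


demo_core = make_facade("demo_core", _impl, ["listen"])

# The public surface can be inspected the way one would at a prompt.
public = [n for n in dir(demo_core) if not n.startswith("_")]
```

With the facade registered, `from demo_core import listen` works, while `_poll_once` is reachable only through the private `_demo_core` module, which makes accidental use of private code easier to spot.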
For the foreseeable future, the twisted.* package hierarchy will continue to work for imports; however, considering how informally specified this interface has been, I imagine improved documentation will emerge for the newer interface packages.

Personally I wanted to call these modules "twisted_web_1", and I also wanted to talk about versioning here, but amidst strenuous objections from Anthony Baxter and Itamar ST I will refrain :). Suffice it to say that I believe that, regardless of the many potential strategies available for versioning, a separate module for interface vs. implementation will make it possible to determine what is public and usable from a deployed application using the Python interactive prompt or by reading the code. Since many packages don't have stability declarations anyway, I think this will result in many fewer unintentional uses of private code.

The first package that we are going to try this on is twisted.web, because there is a major rewrite pending integration of all the fixes and redesign that has happened in the Nevow package. If this goes well, other packages will use the same naming conventions. In any case, we will be migrating packages out of the Twisted core, for example: lore, conch, manhole, and mail.

As part of this process we hope to improve the website to reflect the name "Twisted" as primarily a part of "Twisted Matrix Labs", the group of developers working on these projects. Similar to tigris.org, TML is "a mid-sized open source community focused on building better tools for networked application development". Package maintainers: each project should have its own sub-site, so consider a project description and any other documentation you might want on that site.

As far as our development process, we will be keeping all the code in one subversion repository and still running BuildBot over the entire system in a way similar to how we're doing it now.
There are lots of possible improvements to the process, all the way up to an independent package management system using distutils: please, don't recommend any of these. We don't want to make a huge number of changes at the same time, and while this moving / renaming is fairly mechanical, it will touch almost every file in the world; we need to make sure it all works before we begin adding features based on it.
My concern is entirely documentation, so forgive me for glossing over every other detail contained in your post.

On Tue, Mar 23, 2004, Glyph Lefkowitz wrote:
As part of this process we hope to improve the website to reflect the name "Twisted" as primarily a part of "Twisted Matrix Labs", the group of developers working on these projects. Similar to tigris.org, TML is "a mid-sized open source community focused on building better tools for networked application development". Package maintainers: each project should have its own sub-site, so consider a project description and any other documentation you might want on that site.
I'm not sure what this implies for the documentation: there isn't a very clear distinction drawn in your post between API documentation and HOWTOs. Perhaps this is as it should be, but some distinctions matter for this exercise.

It seems fairly uncontroversial to me that API documentation gets split off onto the project subsites, especially since this can be automated.

The situation with HOWTOs and other documentation is less clear. The "Twisted documentation project" already has its own separate section of the repository, but there are few internal distinctions between Twisted Web docs and PB docs (for example) aside from in the HOWTO index. Interface changes to one Twisted project would require a partial HOWTO release, which suggests that the HOWTOs should be split as well.

However, splitting the HOWTOs into the separate projects leaves the question of any docs that cover the entire family of projects: the tutorial is the single best example. There are also documents that are relevant for developers using many of the products; the documentation on asynchrony is an example. It also leaves open the question of responsibility for keeping those docs up to date.

As for me personally, it's unclear whether this leaves my role as documentation editor redundant or not. Certainly, maintaining documents, or coordinating the maintenance thereof, for a whole bunch of projects on a whole bunch of release cycles is a different, and probably bigger, task than maintaining one set of documents for one project. Even the latter task is only notionally mine (the time I can devote to Twisted documentation is only around 5 hours a week).

If decisions are made affecting this, please let me know. I'm willing to provide input if you like, but I'm not often on IRC, that powerhouse of Twisted decision-making.

-Mary
On Mar 23, 2004, at 7:12 PM, Mary Gardiner wrote:
As for me personally, it's unclear whether this leaves my role as documentation editor redundant or not. Certainly, maintaining documents, or coordinating the maintenance thereof, for a whole bunch of projects on a whole bunch of release cycles is a different, and probably bigger, task than maintaining one set of documents for one project. Even the latter task is only notionally mine (the time I can devote to Twisted documentation is only around 5 hours a week).
If decisions are made affecting this, please let me know. I'm willing to provide input if you like, but I'm not often on IRC, that powerhouse of Twisted decision-making.
Mary,

Your contributions have been extremely valuable so far, and my main concern is that as little as possible changes in this transition :). Consider that the existing documents form a single book which documents a suite of applications. Some of these applications will be deprecated or disappear, and the parts of the book covering them will be removed.

However, at the moment I don't think it's worthwhile to split everything up if that's going to make things harder. If there is a process change to be made, consider this primarily a license for you to solicit more assistance from package maintainers on a regular basis...
On Wed, Mar 24, 2004, Glyph Lefkowitz wrote:
Your contributions have been extremely valuable so far, and my main concern is that as little as possible changes in this transition :). Consider that the existing documents form a single book which documents a suite of applications. Some of these applications will be deprecated or disappear, and the parts of the book covering them will be removed.
The only trouble with this is how to keep the HOWTOs up to date with the codebase(s). At present, HOWTOs are updated on the main webpage whenever a major release happens. Bugs are filed against webpage docs at about twice the rate they're filed against repository docs, so clearly the webpage docs are regarded as authoritative, and it is important that they reflect the released codebase.

I suppose the /documents section of the website could be updated whenever there is *any* release of a Twisted project. That's probably the smallest possible sensible change.
If there is a process change to be made, consider this primarily a license for you to solicit more assistance from package maintainers on a regular basis...
OK! -Mary
Glyph Lefkowitz wrote:
from twisted_web import Resource
So, now we have twisted.web, twisted_web and twisted-web.deb, also known as TwistedWebDeb?

One thing you said relatively little about, but which seems interesting, is what the groups of modules are likely to be. Core, web, mail, conch, ... I'm trying to understand how complex the result will be.
On Mar 25, 2004, at 3:47 AM, Tommi Virtanen wrote:
Glyph Lefkowitz wrote:
from twisted_web import Resource
So, now we have twisted.web, twisted_web and twisted-web.deb, also known as TwistedWebDeb?
twisted.web == twisted_web

We really need to get TwistedWebDeb distributed along with the codebase, so that we can have similar running mechanisms for UNIXes (and an obvious empty spot where a win32 mechanism would go).

A number of people at PyCon pointed me at pkgutil, which may eliminate the need for the underscore in the package name.
Hi all,

I'm working on a chat server-ish kinda thang, and although the learning curve of twisted was a bit much for my last project, I'm back again for more punishment. In this case, I'm looking to broadcast a message from one client to other connected clients. I came up with a solution, but I wanted to run it by you guys to see whether what I'm doing is The Right Way To Do Things.

Here's what I did for my simple prototype that takes any message received from a client and sends it to all clients (including the sender):

-----------
from twisted.internet import reactor
from twisted.internet.protocol import Factory, Protocol

factory = Factory()
factory.transports = []

class Chatterbox(Protocol):
    def connectionMade(self):
        # Make a copy of the new connection's transport and
        # stick it somewhere all the Chatterbox instances can
        # get to it.
        self.factory.transports.append(self.transport)

    def dataReceived(self, data):
        # Every time a piece of data comes in, just write it back
        # out to all the transports, ergo, clients.
        for transport in self.factory.transports:
            transport.write(data)

    def connectionLost(self, reason):
        # When a client disconnects, delete the copy of their
        # transport. Bet this'll be slow w/ lotso clients.
        self.factory.transports.remove(self.transport)

factory.protocol = Chatterbox
reactor.listenTCP(8008, factory)
reactor.run()
-----------

What do you think? Is this a properly twisted way to do things? Strike anyone as needlessly inefficient? I'm motivated to make it as fast as possible, so despite what Knuth said, any proffered optimizations would be great to hear. (On a related note, I intend to put the for loop into a map, but I haven't yet embarked on language optimizations, and this question's more about Twisted.)

Also, I'm obviously, and perhaps incorrectly, assuming that once a transport object is created, it never gets changed. If it does, the copy I made won't be updated -- is this a problem?

A definite problem with this method is that it sends the data right back to the sender, which actually isn't desirable for my application.
Trouble is, to avoid it I think I'd have to do this...

    def dataReceived(self, data):
        for transport in self.factory.transports:
            if transport != self.transport:
                transport.write(data)

Seems an inefficient way to do things, as I'd like this server to handle thousands of clients.

Anyway, any help would be appreciated. :-)

Thanks,

Steve
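One possible variation on the prototype above, offered only as a sketch and not as something proposed in the thread: keep the transports in a set, so that removing a disconnected client no longer scans a list, and skip the sender with an identity check. The `Broadcaster` and `FakeTransport` classes are hypothetical and independent of Twisted; the sketch assumes only that transport objects are hashable and provide a `write()` method. Sending to every other client still costs one `write()` per recipient, so the extra comparison per client is small next to the writes themselves.

```python
class Broadcaster:
    """Tracks connected transports and relays data between them."""

    def __init__(self):
        # A set gives average O(1) add and discard, unlike list.remove().
        self.transports = set()

    def register(self, transport):
        self.transports.add(transport)

    def unregister(self, transport):
        # discard() does not raise if the transport is already gone.
        self.transports.discard(transport)

    def broadcast(self, data, sender=None):
        """Write data to every registered transport except the sender."""
        for transport in self.transports:
            if transport is not sender:
                transport.write(data)


class FakeTransport:
    """Stand-in for a real transport; records what was written."""

    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)


hub = Broadcaster()
alice, bob, carol = FakeTransport(), FakeTransport(), FakeTransport()
for client in (alice, bob, carol):
    hub.register(client)

hub.broadcast(b"hello", sender=alice)  # alice does not get her own message
hub.unregister(carol)
hub.broadcast(b"bye", sender=bob)      # only alice is left to receive this
```

Storing a transport in a list or set keeps a reference to the same object rather than a copy, so the collection always sees the object's current state.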
| Here's what I did for my simple prototype that takes any message
| received from a client and sends it to all clients (including the
| sender):

Alternatively, you could use PB and share some sort of 'talker' object between all connections. This would allow a client to simply do:

    talker.callRemote("write", "hello world")

and have everyone else receive it. I get the feeling the pb/cred combo would be ideal for what you're trying to do. Check it out: http://www.twistedmatrix.com/documents/current/howto/pb-intro
On Sunday 28 March 2004 9:04 pm, Tim Stebbing wrote:
Alternatively, you could use PB and share some sort of 'talker' object between all connections. This would allow a client to simply do:
talker.callRemote("write", "hello world")
and have everyone else receive it. I get the feeling the pb/cred combo would be ideal for what you're trying to do. Check it out: http://www.twistedmatrix.com/documents/current/howto/pb-intro
Unfortunately, I haven't got Python on both ends of the wire here. I've got Twisted/Python on one side, and C++ or similar on the other end. So unless I'm wrong about PB, I don't think it'll apply here. Cred looks useful, though. Thanks! Steve
Glyph Lefkowitz <glyph@divmod.com> writes:
A number of people at PyCon pointed me at pkgutil, which may eliminate the need for the underscore in the package name.
That would be good, as I hate the underscore in the name :-) Why can't the package still be called twisted.web, but be distributed separately (if that's the issue here)?

Paul.
--
This signature intentionally left blank
On Mar 29, 2004, at 1:44 PM, Paul Moore wrote:
Glyph Lefkowitz <glyph@divmod.com> writes:
A number of people at PyCon pointed me at pkgutil, which may eliminate the need for the underscore in the package name.
That would be good, as I hate the underscore in the name :-) Why can't the package still be called twisted.web, but be distributed separately (if that's the issue here)?
That is precisely what pkgutil allows, if I remember correctly. -bob
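For readers unfamiliar with the mechanism being discussed, here is a minimal sketch of how the standard library's `pkgutil.extend_path` lets a single package name span several separately installed directories. The package name `demo_split`, the module names, and the helper `write_distribution` are hypothetical, chosen only for the demonstration; the thread does not say whether or how Twisted would adopt this, and the temporary directories are left in place for simplicity.

```python
import importlib
import os
import sys
import tempfile

# Every distribution ships this same __init__.py for the shared package.
NAMESPACE_INIT = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)


def write_distribution(root, package, module, source):
    """Create root/package/__init__.py and root/package/module.py."""
    package_dir = os.path.join(root, package)
    os.makedirs(package_dir)
    with open(os.path.join(package_dir, "__init__.py"), "w") as f:
        f.write(NAMESPACE_INIT)
    with open(os.path.join(package_dir, module + ".py"), "w") as f:
        f.write(source)


# Two separately "installed" distributions, each in its own directory.
core_root = tempfile.mkdtemp()
web_root = tempfile.mkdtemp()
write_distribution(core_root, "demo_split", "core", "NAME = 'core'\n")
write_distribution(web_root, "demo_split", "web", "NAME = 'web'\n")

sys.path[:0] = [core_root, web_root]
importlib.invalidate_caches()

# Both submodules resolve under the single package name.
core = importlib.import_module("demo_split.core")
web = importlib.import_module("demo_split.web")
```

After the first import, the `__path__` of `demo_split` lists both directories, which is why `demo_split.web` is found even though the package itself was first located in the other directory.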
On Tue, 2004-03-23 at 14:00, Glyph Lefkowitz wrote:
As far as our development process, we will be keeping all the code in one subversion repository and still running BuildBot over the entire
What does this mean for the licensing of the packages? Will new versions of split-out packages still be copyright to Glyph, or will they move to a developer for that package?

-p
--
Paul Swartz
(o_      z3p at twistedmatrix dot com
//\      http://www.twistedmatrix.com/users/z3p.twistd/
V_/_     AIM: Z3Penguin
On Mar 29, 2004, at 1:27 AM, Paul Swartz wrote:
What does this mean for the licensing of the packages? Will new versions of split-out packages still be copyright to Glyph, or will they move to a developer for that package?
I don't know. I'd like to have a centralized copyright structure so that we can move it to a foundation (in the ever-nearer future) but that still involves inconvenient paperwork. Not to mention the fact that copyright is the only enforceable form of attribution, and I think it's unfair that so many major contributors don't really have a signature on Twisted in any significant way. With BSD-licensed software, the possible problems with the license are so small that it seems unlikely that disparate copyrights would cause a problem... I should really talk to a lawyer about this, and to a few key contributors. I have a feeling I already know what some of them are going to say ;). The lawyer I discussed this with last time has moved on, and the other intelprop lawyers I know aren't really in this space - does anyone know a good (cheap ^_^) lawyer with F/OSS experience who has the time to help?
participants (8)
- Bob Ippolito
- Glyph Lefkowitz
- Mary Gardiner
- Paul Moore
- Paul Swartz
- Steve Freitas
- Tim Stebbing
- Tommi Virtanen