On Jul 10, 2014, at 12:32 PM, exarkun@twistedmatrix.com wrote:

Hello all,

Some of you may have heard rumors of some work in progress on a replacement for Twisted's IConsumer/IProducer interfaces.

Tubes has been largely Glyph's effort (though a lot of people have contributed in one way or another).  And a large effort it's been. Development is proceeding in a Twisted branch and comes to over three thousand lines of additions so far.

Given the large size of the implementation and the long time that this effort has been underway (I remember the Twisted meetup at the Rackspace offices that *I* attended when I was visiting SF... a year and a half ago... at which point tubes wasn't exactly a brand new project) I'd like to re-raise the idea that the best next step for the project is to see some distribution in its *current* state.

Unfortunately, while I can see a lot of hypothetical benefit to what you're describing, I don't think this is appropriate in this specific case.  With other, superficially similar projects (large new features within Twisted), this might be the right thing to do, with some caveats about how we do new-feature integration that I discuss below.  Some of the aspects of tubes make it extra hard to split out though, despite the fact that it has few immediate dependencies.

The Tubes package implements a new primitive, which means that everything that uses it is going to be very tightly coupled to its precise semantics, and it has very little wiggle-room in terms of evolution once it's been released.

In its current state, Tubes is basically a research project.  Every focused burst of activity on it has resulted in a complete, 100% break in backwards compatibility, on the level of its API, its terminology, and its semantics.  Having a separate release might not imply that there will be a compatibility policy on par with the strictness of Twisted's current one, but it does usually imply some level of support or continued development.  If anyone had written an application against a previous revision of the Tubes branch, it would probably have been broken in the first place and it would definitely not still be working today (nor would it really be possible to port it or evolve it to use the new version without a complete rewrite).

While I currently believe that Tubes's API has firmed up and is suitable for general purpose use, I have believed that at various points in the past as well, when it was in fact completely wrong.  This sentiment is very much of the "this time for sure!" variety, and I will not have confidence that it's actually complete until we have made it all the way through the documentation, examples, and testing of a full proof of concept - at which point I believe it would be suitable to include in a Twisted release anyway.

Specifically, I think it would be beneficial to set up a tubes project on Github under the Twisted organization and try for a release in the very near future.

I think this has several advantages over the status quo:

1) As an independent project, tubes will attract more attention than it presently gets as a relatively unknown ticket & branch of Twisted.

I would appreciate attention in the form of code review, commentary, and experimentation.  I would not appreciate attention in the form of actual users, though.  At least, not right now.  Maybe quite soon, though, depending on how the next few development sessions go.

2) As a separate Python package, the logistics of actually using tubes are simpler (just consider how you might declare a dependency on a branch of Twisted - keeping in mind you may want to use tubes in a project that already depends on some version of Twisted).  It may not make sense to say that it is the same quality as Twisted proper right off the bat (on the other hand, it may well - I suspect tubes in its current form actually is a lot higher quality than large sections of Twisted) but that doesn't mean people (not to mention the tubes project) can't benefit from being able to experiment with it.

I would love it if there were a way to release a package in an actually experimental state, and not just have the release of a package implicitly tell people that it's time to put it into production and demand long-term support for it.  Quick sanity check: go run 'pip freeze' in a production virtualenv you're running - what percentage of the version numbers that come back start with a zero?  I will bet a significant amount of money that it's not 0% :-).
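For anyone who wants to run that sanity check without counting by hand, here is a minimal sketch.  The function name and the sample data are made up for illustration; it only looks at pinned "name==version" lines and treats a version as "zero" when its first dot-separated component is exactly 0.

```python
def zero_version_share(freeze_output):
    """Count pinned packages whose version starts with a zero.

    Returns (zero_count, total) for the 'name==version' lines in the
    given 'pip freeze' output.  Anything without '==' (editable
    installs, comments, blank lines) is skipped.
    """
    total = 0
    zero = 0
    for line in freeze_output.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        _name, version = line.split("==", 1)
        total += 1
        # "0.2" and "0.1.0" count as zero versions; "10.0" does not.
        if version.split(".")[0] == "0":
            zero += 1
    return zero, total


# Illustrative sample only, not taken from a real environment.
sample = """\
Twisted==14.0.0
zope.interface==4.1.1
characteristic==0.1.0
service-identity==0.2
"""

zero, total = zero_version_share(sample)
print("%d of %d pinned packages are 0.x" % (zero, total))
```

To try it against a real virtualenv, save the output of 'pip freeze' to a file and pass the file's contents to the function instead of the sample string.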

As it stands, if you're not willing to use a random outdated branch of Twisted with unknown bugs that may change without warning, you're probably not willing to adopt Tubes yet.

3) Decoupling tubes from Twisted frees tubes from certain of Twisted's policies which are more challenging to follow for the kind of non-trivial, brand new code base that tubes is.  Technically we could just say that these policies don't apply to a tubes package *in* Twisted but this kind of subtle distinction is often lost on users (i.e. application developers).

In this case, I actually want the twisted compatibility policy to apply.

4) At this point, a normal review of the tubes branch is going to be a problem.  We do not have good tools or mechanisms for dealing with branches this large.  The code in the current tubes branch can just become master of a new project.  Development going forward from this point should continue to follow the feature-branch, small-changes, pre-commit-peer-review process.  But those 3k lines are written already. Short of an extremely expensive effort to break the work up into smaller, self-contained pieces there's simply never going to be a *good* review in the typical style.

This is an issue either way, though.  And I believe that developing outside of Twisted just exacerbates the issue because it provides an opportunity for faster-paced development, which means more development, which means more addition of more lines of code, which means even more stuff to review eventually.

For example, the recently-landed logging branch was developed in Calendar Server first since we didn't want to develop something so central without experimenting with it in a real application first.  But that meant that by the time it landed it was a pretty substantial amount of code with many different features, rather than landing changes incrementally.  This created a massive code review problem, especially since we had no takers on my alternate code-review strategy proposals.  Since the branch evolved somewhat in response to feedback during its transplantation to Twisted and during the code review, we couldn't even plausibly say "this has been used in production" any more, since what landed ended up being different in some important details.  Don't get me wrong; it ended up being better in those details, the code review was totally worthwhile, but it nevertheless substantially lengthened development time.  (While I _very_ much appreciate Ralph heroically reviewing the whole branch by himself at PyCon, that doesn't really point to a scalable strategy for future feature development.)

So I am keenly interested in ways to address this problem rather than to work around it.  If we are going to try to develop new big Twisted features outside of Twisted, maybe that's a good idea, but then we need a modified code-review policy for accepting those projects back in, where they are going to be subjected to code-review standards during development rather than in one giant burst at the end.

Additionally, it may turn out that tubes can remain independent indefinitely.  Someday perhaps Twisted would come to depend on it to allow the various protocols and applications implemented in Twisted to benefit from the superior abstractions it provides.  Or maybe once it has undergone a few iterations it will make sense to bring it back to Twisted.  I don't think this needs to be decided now.

Again, for this specific case, I also don't think this would make a lot of sense.  The real benefit of Tubes will not be realized until all the protocols implemented within Twisted have at least a mechanism for integrating with them, if not being implemented using them entirely.  If you are writing a protocol that does websockets on one end and database traffic on the other, and you have a tube for processing something, if your inputs aren't founts and your outputs aren't drains, all the fancy flow-control features just won't work and you'll have minimal performance and robustness benefits.  (You might get some nice architectural benefits internally in your code, but you could get those with any kind of good composition idiom.)

There are downsides, of course.  All of the boring maintenance involved with having a separate project - setting up CI, actually doing the releases, etc.  Perhaps we could find some volunteers to help out with these tasks, though, in exchange for getting some great code out there?

Even if my opinion were inverted on all of the points above, I really want to avoid doing this - I especially want to avoid the part where it somehow doesn't work on Twisted's heterogeneous CI system when we try to reintegrate it after apparently working on some other one-version-of-linux CI infrastructure for some time :).

I'm curious what the folks out there who develop applications using Twisted would find to be the easiest path forward.  I'm also curious to hear what Glyph thinks about all this. ;)

Thanks for the reminder that I need to be putting more time into this.  Hopefully I can put as much time into developing it this week as I put into writing this email ;-).

If you're writing this message because you want to use Tubes then the best way to do that would be to help finish developing it.  I'm happy to make some pair-programming appointments during the week - just let me know.  (Although this shrinks the pool of qualified reviewers even further, I'm sure if it's ready and we all start yelling about it, we can find someone to do it.)