[Twisted-Python] Global reactor unit tests in the Twisted test suite
Hello, I'd like for us to decide that we will introduce no new unit tests into Twisted's test suite which use the global reactor. These tests require extra complexity in trial - to support the Deferred, to check for left over event sources (incompletely implemented), and to clean up left over event sources (incompletely implemented). The tests themselves are also complicated by the need to clean up those event sources. The tests only exercise functionality against one reactor at a time which leads to additional challenges for buildbot. If a test is for reactor functionality with multiple implementations, the ReactorBuilder-style tests are better. If a test is for implementation details of a particular reactor, I think the necessary parts of that reactor should just be invoked directly - on a new instance. For anything that's not a test for reactor functionality, no reactor should be involved at all (protocol implementations, etc). I've mentioned these ideas to various people at various times, but they don't seem to be catching on, so I'd like to hear why and come to some conclusion about how we're going to write tests in the future. Jean-Paul
+1. Agree completely, did not realize this was still a point of contention.
On Tue, Nov 1, 2011 at 10:49 AM, Laurens Van Houtven <_@lvh.cc> wrote:
+1.
Agree completely, did not realize this was still a point of contention.
I think the only point of contention would be if you're trying to modify an existing piece of code that uses the global reactor, and you need to write tests for your changes (which are unrelated to the reactor usage itself), it'd add a lot of overhead to getting your change in. I whole-heartedly agree with the sentiment, though. We need to get rid of the global reactor. -- Christopher Armstrong http://radix.twistedmatrix.com/ http://planet-if.com/
On Tue, Nov 1, 2011 at 12:14 PM, Phil Mayers
On 01/11/11 15:28, Christopher Armstrong wrote:
I whole-heartedly agree with the sentiment, though. We need to get rid of the global reactor.
Why?
I find the "pass reactor as 1st argument to everything" API pattern messy. I'm sure there's a good reason. What is it?
An insufficiently advanced sense of taste? :P jml
On 01/11/11 16:18, Jonathan Lange wrote:
On Tue, Nov 1, 2011 at 12:14 PM, Phil Mayers
wrote: On 01/11/11 15:28, Christopher Armstrong wrote:
I whole-heartedly agree with the sentiment, though. We need to get rid of the global reactor.
Why?
I find the "pass reactor as 1st argument to everything" API pattern messy. I'm sure there's a good reason. What is it?
An insufficiently advanced sense of taste? :P
Well, it's a good job I've got you guys here to educate me. On some mailing lists, you'd just get a flip reply.
On Tue, Nov 1, 2011 at 12:45 PM, Phil Mayers
On 01/11/11 16:18, Jonathan Lange wrote:
On Tue, Nov 1, 2011 at 12:14 PM, Phil Mayers
wrote: On 01/11/11 15:28, Christopher Armstrong wrote:
I whole-heartedly agree with the sentiment, though. We need to get rid of the global reactor.
Why?
I find the "pass reactor as 1st argument to everything" API pattern messy. I'm sure there's a good reason. What is it?
An insufficiently advanced sense of taste? :P
Well, it's a good job I've got you guys here to educate me. On some mailing lists, you'd just get a flip reply.
Sorry, I thought you were asking us to come up with good reasons for why you find it messy. jml
On 01/11/11 16:58, Itamar Turner-Trauring wrote:
I find the "pass reactor as 1st argument to everything" API pattern messy. I'm sure there's a good reason. What is it?
1. Supporting multiple reactors.
Interesting. I had assumed that was a "never on the cards" option. Would it be possible to "nest" the reactors, or would they have to run in separate threads? Or is this speculative?
2. Unit testing: if you have an explicit object, you can replace it more easily with a fake.
Ok. I realise it's a bit late in the day at this point, but I do wonder if something like: def method(arg, reactor=DefaultReactor): ... ...with DefaultReactor being some kind of lazy singleton or weak reference might not have been a more friendly pattern to API users. As it is, I might bind a keystroke to "reactor, " ;o)
On Tue, Nov 1, 2011 at 10:06 AM, Phil Mayers
On 01/11/11 16:58, Itamar Turner-Trauring wrote:
I find the "pass reactor as 1st argument to everything" API pattern messy. I'm sure there's a good reason. What is it?
1. Supporting multiple reactors.
Interesting. I had assumed that was a "never on the cards" option. Would it be possible to "nest" the reactors, or would they have to run in separate threads? Or is this speculative?
2. Unit testing: if you have an explicit object, you can replace it more easily with a fake.
Ok. I realise it's a bit late in the day at this point, but I do wonder if something like:
def method(arg, reactor=DefaultReactor): ...
...with DefaultReactor being some kind of lazy singleton or weak reference might not have been a more friendly pattern to API users.
As it is, I might bind a keystroke to "reactor, " ;o)
Singletons are an anti-pattern. Because the reactor is a singleton:
* We can't have multiple reactors instantiated
* We can't restart reactors
* We can't isolate certain reactor-dependent unit tests
* We have to install reactors, which requires careful ordering of imports
In Bravo, Exocet is used to set up a pseudo-sandbox. If reactor were
not a singleton, I could isolate people's usage of Twisted calls, and
force them to proxy everything. Instead, I have to grant them access
to the reactor, because it's shared global state. Is this horribly
important? No, but it does prohibit more effective sandboxing and
isolation.
Exocet needs to keep a list of modules which are so fucked that they
must be allowed to touch sys.modules to work correctly. There are two
things in that list: os.path and twisted.internet.reactor. Also, the
reactor only works if it's imported before Exocet. While Exocet's not
exactly a bastion of sanity, this doesn't speak well of the magic that
the reactor goes through in order to advertise itself globally.
Those are my reasons. They're not superb reasons from a
getting-shit-done point of view, but from a reducing-cyclopean-horrors
point of view, I think they're worth considering.
~ C.
--
When the facts change, I change my mind. What do you do, sir? ~ Keynes
Corbin Simpson
Hello everebody. May be somebody knows open source chat app for twisted web? Possible not the complete app, but just skeleton. Possible some snippets, pieces... Will appreciate any suggestions. Thanks.
On Nov 1, 2011, at 3:57 PM, Alexey Ganzha wrote:
Hello everebody. May be somebody knows open source chat app for twisted web? Possible not the complete app, but just skeleton. Possible some snippets, pieces... Will appreciate any suggestions.
There is a tutorial on how to build the a chat server using Nevow here: http://divmodsphinx.funsize.net/nevow/chattutorial/index.html#the-chat-tutor... -glyph
Thank You very much! On 11/02/2011 01:24 AM, Glyph wrote:
On Nov 1, 2011, at 3:57 PM, Alexey Ganzha wrote:
Hello everebody. May be somebody knows open source chat app for twisted web? Possible not the complete app, but just skeleton. Possible some snippets, pieces... Will appreciate any suggestions. There is a tutorial on how to build the a chat server using Nevow here:
http://divmodsphinx.funsize.net/nevow/chattutorial/index.html#the-chat-tutor...
-glyph _______________________________________________ Twisted-Python mailing list Twisted-Python@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
Hi! Thanks again, Glyph for the link. It took me less an hour to embed chat app in existent site (except that nevow is hosted on lanuchpad now, instead of divmod.org). It is ready to use, just according to the tutorial. Just add some js and css to chatthing package to skin athena app. One problem - i had to use it as a separate app. My own app is a raw twisted.web. I saw OldResourceAdapter in Nevows code. Possible it need to write my own NewResourceAdapter for twisted.web :) Any suggestions everybody? May be somebody had already implemented that (embed Nevow as a t.w.Resource)? Thanks again for Your help, guys. Twisted is beautiful. On 11/02/2011 01:24 AM, Glyph wrote:
On Nov 1, 2011, at 3:57 PM, Alexey Ganzha wrote:
Hello everebody. May be somebody knows open source chat app for twisted web? Possible not the complete app, but just skeleton. Possible some snippets, pieces... Will appreciate any suggestions. There is a tutorial on how to build the a chat server using Nevow here:
http://divmodsphinx.funsize.net/nevow/chattutorial/index.html#the-chat-tutor...
-glyph _______________________________________________ Twisted-Python mailing list Twisted-Python@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
On Nov 2, 2011, at 2:34 PM, Alexey Ganzha wrote:
Hi! Thanks again, Glyph for the link. It took me less an hour to embed chat app in existent site (except that nevow is hosted on lanuchpad now, instead of divmod.org). It is ready to use, just according to the tutorial. Just add some js and css to chatthing package to skin athena app.
Glad to hear it!
One problem - i had to use it as a separate app. My own app is a raw twisted.web. I saw OldResourceAdapter in Nevows code. Possible it need to write my own NewResourceAdapter for twisted.web :)
Any suggestions everybody? May be somebody had already implemented that (embed Nevow as a t.w.Resource)?
I don't know if anyone's done it at the API level - there are some limitations in twisted web that might make it problematic. But, you can always easily set up a reverse proxy so that your nevow server can live in the same URL structure as the rest of your Twisted app!
Thanks again for Your help, guys. Twisted is beautiful.
Thanks for saying so :). -glyph
On Nov 1, 2011, at 12:14 PM, Phil Mayers wrote:
On 01/11/11 15:28, Christopher Armstrong wrote:
I whole-heartedly agree with the sentiment, though. We need to get rid of the global reactor.
Why?
Sometimes you want a different reactor. The most common reason is unit testing, although if we could successfully eliminate the global reactor everywhere, there are other things that we might be able to do: slowing down or speeding up time from a protocol's perspective (by replacing IReactorTime), grouping related objects together in reactors that can be shut down together (so that reactor.stop() doesn't actually end the process); or, similarly, suspending a group of related objects (removeReader()/removeWriter() on everything) so that they can be inspected for debugging purposes, without suspending the thing doing the inspecting (a manhole python prompt).
I find the "pass reactor as 1st argument to everything" API pattern messy. I'm sure there's a good reason. What is it?
This pattern is a solution to the problem, but I agree that it is possibly not the optimal solution. It sort of points in a direction where every possible module that might be imported becomes an argument to your function. After all, there are plenty of other modules which have to be mocked for testing, why not just make everyone's __init__ method take sys.modules as an argument too, and never import anything? In more complex systems this can definitely turn into a bit of a mess. Nevertheless, "real reactor as default argument" is not a huge improvement either, because it typically breaks one level out. If you have 'a(reactor=defaultReactor)' and then 'b()' needs to call 'a', half the time 'b' will forget to supply a reactor argument and call 'a()' passing nothing, because that appears to be the suggested API. Any code that calls 'b()' then needs to deal with the fact that the global reactor is going to get used, even if it has a cleanly encapsulated parameter of its own. With sufficient discipline, of course, this approach is equivalent. So despite its imperfections, I don't have a solution better than "pass the reactor to everything" yet. It seems to be better than the alternatives in almost every case. The one place it's not better is when writing brief example scripts, but I suspect this is only the case because examples have to expose the semantic conflict between the way you actually get the reactor (import the singleton) and the way that you want to deal with the reactor (as a parameter). If we had a sensible top-level way of getting the reactor I think examples which didn't use the global would read much more cleanly. If you can think of a better solution that addresses all of these concerns simultaneously somehow, please share, I'd love to hear it :-). -glyph
On 11/01/2011 06:48 PM, Glyph wrote:
Nevertheless, "real reactor as default argument" is not a huge improvement either, because it typically breaks one level out. If you have 'a(reactor=defaultReactor)' and then 'b()' needs to call 'a', half the time 'b' will forget to supply a reactor argument and call 'a()' passing nothing, because that appears to be the suggested
Darn. I hadn't thought of that. You could always walk back up the call stack looking for the implemenation of IReactorXXX that you want... what? Stop looking at me like that! I kid of course... the right answer is to make everything a method on the reactor! No wait...
So despite its imperfections, I don't have a solution better than "pass the reactor to everything" yet. It seems to be better than the alternatives in almost every case. The one place it's not better is when writing brief example scripts, but I suspect this is only the
Well, not only brief example scripts; brief actual scripts. One of the nice things about Twisted is that, with a bit of familiarity, you can knock up some really fast & powerful event-based network clients for ad-hoc jobs. One example, we use a tiny script wrapping DeferredSemaphore and t.p.utils.getProcessOutput to parallelise SSH scanning, because ssh-keyscan is (ahem) poorly implemented. Maybe I knock up more of these short 20-liners than is common? And typing "reactor, " a lot in them seems a bit, well... retrograde. But you're right. Default "reactor=X" parameters will just get passed wrong, dammit...
If you can think of a better solution that addresses all of these concerns simultaneously somehow, please share, I'd love to hear it :-).
Hmpf. I suspect you've covered most of the ground. If anything springs to mind I'll be sure to speak up! Cheers, Phil
On Nov 1, 2011, at 4:36 PM, Phil Mayers wrote:
One example, we use a tiny script wrapping DeferredSemaphore and t.p.utils.getProcessOutput to parallelise SSH scanning, because ssh-keyscan is (ahem) poorly implemented.
Maybe I knock up more of these short 20-liners than is common? And typing "reactor, " a lot in them seems a bit, well... retrograde.
Hopefully you only need to pass it in at the top level to a few things, not everywhere to everything :). For example, with something like twisted.web.client.Agent, you pass the reactor once and then you have an object you can call methods on, you don't need to keep repeating it. If getProcessOutput were a method of a more helpful object, perhaps you'd be able to type 'reactor' a bit less. Another possible eventual improvement here is that Twisted include a script-running tool which creates the reactor for you and passes it, either to a specified function or as a magic global in a script. Then you should just pass 'reactor' to the top-level stuff in your script. Sort of like 'twistd' but for one-off things where you don't want to go to the trouble to build a plugin or even a Service. As a bonus, such a tool could call 'reactor.run()' for you automatically, eliminating some newbie confusion about when it's appropriate to do that. (And ideally, 'reactor.stop' when some appropriate Deferred is fired, but that gets into more thinking than I want to do about this right now...) -glyph
On Tue, Nov 1, 2011 at 20:48, Glyph
On Nov 1, 2011, at 12:14 PM, Phil Mayers wrote:
I find the "pass reactor as 1st argument to everything" API pattern messy. I'm sure there's a good reason. What is it?
This pattern is a solution to the problem, but I agree that it is possibly not the optimal solution. It sort of points in a direction where every possible module that might be imported becomes an argument to your function. After all, there are plenty of other modules which have to be mocked for testing, why not just make everyone's __init__ method take sys.modules as an argument too, and never import anything? In more complex systems this can definitely turn into a bit of a mess.
I've often wondered whether there is a *real* use of the reactor argument other than unit testing. I haven't had other uses for it, and haven't seen (or perhaps not understood) real application code making use of it. Now, I've sometimes made an argument for unit testing, other than the obvious quality assurance, that unit tested code will be more reusable and have better interfaces as it already is used for at least two things: the real application and the unit test. However, if accommodating unit testing requires sacrificing the natural interface, then that kind of takes the edge off that argument. Thanks to everybody for the discussion and input. Very informative! -- Anton
On Tue, Nov 1, 2011 at 5:29 PM, Anton Gyllenberg
On Nov 1, 2011, at 12:14 PM, Phil Mayers wrote:
I find the "pass reactor as 1st argument to everything" API pattern messy. I'm sure there's a good reason. What is it?
This pattern is a solution to the problem, but I agree that it is
On Tue, Nov 1, 2011 at 20:48, Glyph
wrote: possibly not the optimal solution. It sort of points in a direction where every possible module that might be imported becomes an argument to your function. After all, there are plenty of other modules which have to be mocked for testing, why not just make everyone's __init__ method take sys.modules as an argument too, and never import anything? In more complex systems this can definitely turn into a bit of a mess. I've often wondered whether there is a *real* use of the reactor argument other than unit testing. I haven't had other uses for it, and haven't seen (or perhaps not understood) real application code making use of it. Now, I've sometimes made an argument for unit testing, other than the obvious quality assurance, that unit tested code will be more reusable and have better interfaces as it already is used for at least two things: the real application and the unit test. However, if accommodating unit testing requires sacrificing the natural interface, then that kind of takes the edge off that argument.
Thanks to everybody for the discussion and input. Very informative!
Here's the thing: not only does removing the global reactor make *our* unit tests a lot nicer, it makes the unit tests of our users much easier to write, since they will want to use our APIs in their unit tests in a way that doesn't require ugly global hacking. I don't know how many times I have raged at some crappy library whose code is incompatible with consumer unit testing. Having testable interfaces is a really valuable feature we can provide to our users. -- Christopher Armstrong http://radix.twistedmatrix.com/ http://planet-if.com/
On 09:37 pm, radix@twistedmatrix.com wrote:
On Nov 1, 2011, at 12:14 PM, Phil Mayers wrote:
I find the "pass reactor as 1st argument to everything" API pattern messy. I'm sure there's a good reason. What is it?
This pattern is a solution to the problem, but I agree that it is
On Tue, Nov 1, 2011 at 20:48, Glyph
wrote: possibly not the optimal solution. It sort of points in a direction where every possible module that might be imported becomes an argument to your function. After all, there are plenty of other modules which have to be mocked for testing, why not just make everyone's __init__ method take sys.modules as an argument too, and never import anything? In more complex systems this can definitely turn into a bit of a mess. I've often wondered whether there is a *real* use of the reactor argument other than unit testing. I haven't had other uses for it, and haven't seen (or perhaps not understood) real application code making use of it. Now, I've sometimes made an argument for unit testing, other than the obvious quality assurance, that unit tested code will be more reusable and have better interfaces as it already is used for at least two things: the real application and the unit test. However, if accommodating unit testing requires sacrificing the natural interface, then that kind of takes the edge off that argument.
Thanks to everybody for the discussion and input. Very informative! Here's the thing: not only does removing the global reactor make *our* unit tests a lot nicer, it makes the unit tests of our users much easier to write, since they will want to use our APIs in their unit tests in a way
On Tue, Nov 1, 2011 at 5:29 PM, Anton Gyllenberg
wrote: that doesn't require ugly global hacking. I don't know how many times I have raged at some crappy library whose code is incompatible with consumer unit testing. Having testable interfaces is a really valuable feature we can provide to our users.
To emphasize, having testable interfaces is a *really* *really* valuable feature we can provide to our users. If you're a Twisted user and you're not taking advantage of it, you're missing out on a lot. Jean-Paul
On Tue, Nov 01, 2011 at 02:48:02PM -0400, Glyph wrote:
If you can think of a better solution that addresses all of these concerns simultaneously somehow, please share, I'd love to hear it :-).
I'm not sure if it addresses all your concerns, but twisted.python.context will let you set a particular value for things you call and all their descendants (unless one of them sets a new value in the context). I can imagine interleaving code that passes a reactor parameter explicitly and code that grabs a reactor from the current context without much hassle. Tim.
On Nov 1, 2011, at 10:20 PM, Tim Allen wrote:
On Tue, Nov 01, 2011 at 02:48:02PM -0400, Glyph wrote:
If you can think of a better solution that addresses all of these concerns simultaneously somehow, please share, I'd love to hear it :-).
I'm not sure if it addresses all your concerns, but twisted.python.context will let you set a particular value for things you call and all their descendants (unless one of them sets a new value in the context). I can imagine interleaving code that passes a reactor parameter explicitly and code that grabs a reactor from the current context without much hassle.
I invented twisted.python.context for this very reason! But... I find the idea of encouraging its widespread use disconcerting. There are a couple of issues (and if you search the web for "emacs lisp dynamic scope" you will find a whole bunch of them) but the main problem is this: When you do context.get(something) and something isn't there... what do you do? Whose fault is it? As a parallel example, when a function gets called with the wrong argument, it's quite easy to figure out where the problem is: you just look at the caller, and the caller almost certainly knows who they are. The caller somehow has to know about the signature of the callee and if they get it wrong it's an immediate and obvious runtime error. But when code has a relationship to its caller's caller's caller's caller's caller, and there may be any amount of indirection in there (callbacks registered on Deferreds, for example) how do you find the place that really should have been the one to pass the result? How do you know at the time you invoke something, all the context that it may require? The immediate place I can think of where this would cause issues is with GUI callbacks. Right now Twisted goes out of its way to make sure that you can '.connect("clicked", ...)' on a GTK widget and use any old python callback you want. If we made the reactor context-driven, then that would work... mostly. If Twisted's callbacks were run with the reactor present, then the GUI's callbacks might not have the reactor available to them because they lack some setup. Of course you can probably fix that problem, but there are other places where it will crop up again; the standard idiom is to call a bunch of functions to set up event handlers, then call reactor.run(). But what if the reactor isn't available to those functions before it's run? Et cetera, et cetera. If you can think of a solid, robust way to specify contracts that describe what context is expected, by whom, and how it can be filled at the right time without surprising consequences, I'd love to hear it, but I won't hold my breath :). -glyph
On Tue, Nov 1, 2011 at 6:48 PM, Glyph
On Nov 1, 2011, at 12:14 PM, Phil Mayers wrote:
On 01/11/11 15:28, Christopher Armstrong wrote:
I whole-heartedly agree with the sentiment, though. We need to get rid of the global reactor.
Why?
Sometimes you want a different reactor. The most common reason is unit testing, although if we could successfully eliminate the global reactor everywhere, there are other things that we might be able to do: slowing down or speeding up time from a protocol's perspective (by replacing IReactorTime), grouping related objects together in reactors that can be shut down together (so that reactor.stop() doesn't actually end the process); or, similarly, suspending a group of related objects (removeReader()/removeWriter() on everything) so that they can be inspected for debugging purposes, without suspending the thing doing the inspecting (a manhole python prompt).
I find the "pass reactor as 1st argument to everything" API pattern messy. I'm sure there's a good reason. What is it?
This pattern is a solution to the problem, but I agree that it is possibly not the optimal solution. It sort of points in a direction where every possible module that might be imported becomes an argument to your function. After all, there are plenty of other modules which have to be mocked for testing, why not just make everyone's __init__ method take sys.modules as an argument too, and never import anything? In more complex systems this can definitely turn into a bit of a mess.
Nevertheless, "real reactor as default argument" is not a huge improvement either, because it typically breaks one level out. If you have 'a(reactor=defaultReactor)' and then 'b()' needs to call 'a', half the time 'b' will forget to supply a reactor argument and call 'a()' passing nothing, because that appears to be the suggested API. Any code that calls 'b()' then needs to deal with the fact that the global reactor is going to get used, even if it has a cleanly encapsulated parameter of its own. With sufficient discipline, of course, this approach is equivalent. ... If you can think of a better solution that addresses all of these concerns simultaneously somehow, please share, I'd love to hear it :-).
reactor = getUtility(IReactor)? jml
On 06:06 pm, jml@mumak.net wrote:
On Tue, Nov 1, 2011 at 6:48 PM, Glyph
wrote: On Nov 1, 2011, at 12:14 PM, Phil Mayers wrote:
On 01/11/11 15:28, Christopher Armstrong wrote:
I whole-heartedly agree with the sentiment, though. We need to get rid of the global reactor.
Why?
Sometimes you want a different reactor. �The most common reason is unit testing, although if we could successfully eliminate the global reactor everywhere, there are other things that we might be able to do: slowing down or speeding up time from a protocol's perspective (by replacing IReactorTime), grouping related objects together in reactors that can be shut down together (so that reactor.stop() doesn't actually end the process); or, similarly, suspending a group of related objects (removeReader()/removeWriter() on everything) so that they can be inspected for debugging purposes, without suspending the thing doing the inspecting (a manhole python prompt).
I find the "pass reactor as 1st argument to everything" API pattern messy. I'm sure there's a good reason. What is it?
This pattern is a solution to the problem, but I agree that it is possibly not the optimal solution. �It sort of points in a direction where every possible module that might be imported becomes an argument to your function. �After all, there are plenty of other modules which have to be mocked for testing, why not just make everyone's __init__ method take sys.modules as an argument too, and never import anything? In more complex systems this can definitely turn into a bit of a mess.
Nevertheless, "real reactor as default argument" is not a huge improvement either, because it typically breaks one level out. �If you have 'a(reactor=defaultReactor)' and then 'b()' needs to call 'a', half the time 'b' will forget to supply a reactor argument and call 'a()' passing nothing, because that appears to be the suggested API. Any code that calls 'b()' then needs to deal with the fact that the global reactor is going to get used, even if it has a cleanly encapsulated parameter of its own. �With sufficient discipline, of course, this approach is equivalent. ... If you can think of a better solution that addresses all of these concerns simultaneously somehow, please share, I'd love to hear it :-).
reactor = getUtility(IReactor)?
That's not really a solution. It's the barest glimpse of a large system which might be applied as a solution. Do you want to expand it? Jean-Paul
On Nov 7, 2011, at 10:25 AM, exarkun@twistedmatrix.com wrote:
reactor = getUtility(IReactor)?
That's not really a solution. It's the barest glimpse of a large system which might be applied as a solution. Do you want to expand it?
I'd like to echo this request for exposition. My understanding is that 'getUtility' is just a baroque way to access a global variable, with most of the disadvantages of said globalness intact. What does it provide that our existing import strategy doesn't?
On Tue, Nov 8, 2011 at 5:44 AM, Glyph Lefkowitz
On Nov 7, 2011, at 10:25 AM, exarkun@twistedmatrix.com wrote:
reactor = getUtility(IReactor)?
That's not really a solution. It's the barest glimpse of a large system which might be applied as a solution. Do you want to expand it?
I'd like to echo this request for exposition. My understanding is that 'getUtility' is just a baroque way to access a global variable, with most of the disadvantages of said globalness intact. What does it provide that our existing import strategy doesn't?
An extra layer of abstraction and XML-based configuration – what's not to love? It could allow for a clearer expression of dependency. Something that only uses IReactorTime would only get the utility for that. This would in turn make it easier to swap out only parts for testing, or to do things like "speed up or slow down time from a protocol's perspective". However, you're right. Ultimately, it's a process-wide configuration setting. That means that setting it is a slightly politer form of monkey patching and that it's not fine-grained enough to do the other things you suggested (e.g. "suspending a group of related objects"). jml
On Nov 1, 2011, at 10:38 AM, exarkun@twistedmatrix.com wrote:
Hello,
I'd like for us to decide that we will introduce no new unit tests into Twisted's test suite which use the global reactor.
+1. Let's add this to the coding standard. It doesn't look to me like anyone's objecting. I think it might be worth considering a grandfather clause. If we have an outstanding patch or branch which is otherwise close to being merged, whose only problem is real-reactor tests, I think it might not be worth it to block its landing at this point. Instead, we could file a new ticket to fix its tests separately. I don't know of any such outstanding tickets though, so hopefully this is a baseless concern. And I'm happy to be overruled on this point, as well, if you feel strongly about it :).
These tests require extra complexity in trial - to support the Deferred, to check for left over event sources (incompletely implemented), and to clean up left over event sources (incompletely implemented).
It might be useful to have a switch in trial that reports tests which return a Deferred, to make it easy to spot violators of this policy within Twisted. I hope you're not suggesting getting rid of the feature _entirely_ from trial, though. The first step to doing so would be to provide a totally comprehensive, complete, documented, maintained test fake for every interface supplied by every reactor... and then some. I know that there are hundreds of tests for applications which _use_ Twisted that I've written which would be mind-bendingly difficult to rewrite using even a fake reactor. For one simple example, on more than one occasion I've had a high-level abstraction wrapping a proprietary C program with spawnProcess and I wanted to test high-level logic that dealt with its results in specific cases; I could of course try to test everything using fakes, but it was actually harder to write the fakes than to set up the state for the real system, and the tests were significantly higher-value as regression tests with the real components integrated. (In some cases it's a proprietary C extension module which needs to be run via deferToThread.) That said, there are some caveats: none of these considerations should ever apply to Twisted itself; if they do seem to apply, then it's probably time to split that code out into a separate project. Conversely, third-party projects should be able to implicitly rely on the reactor working as expected, and therefore don't have any concerns about testing the reactor itself. If we could publicly expose the ReactorBuilder testing facility, and make it build only a single reactor by default, then it might be possible to test code like I described by spinning up a new reactor for each thing... but in some cases, there's an expensive subprocess that I really want multiple tests to share. Would it be possible to move the associated file descriptors between reactors? (On a related note, I wonder, how am I ever going to get those tests to work in disttrial...) To summarize my own position about Trial, Deferreds, and real-reactor testing in general: Whenever it's a reasonable amount of effort, tests should always be written against in-memory data structures and avoid doing real I/O, including any real asynchrony, since tests like that are easier to maintain and debug. Within Twisted itself, it should always be a reasonable amount of effort to write tests the "right" way. Any code which seems to require a real-asynchronous testing approach is either out of scope for Twisted or has other architectural problems which need to be addressed before landing it. Some applications which use Twisted will need to interact with other systems as part of their unit tests, and they should be able to use the reactor to do that. We should provide as many facilities to help people avoid writing tests like this as possible, but there are circumstances where we couldn't possibly provide enough, so we should continue to support trial's use as an integration testing tool. We don't currently provide enough facilities to make it easy to avoid the reactor in all the cases where you should avoid it, and we should really improve proto_helpers et. al. so that it's easier to find in-memory implementations of stuff.
The tests themselves are also complicated by the need to clean up those event sources.
Amen. Even today, the only document I'm aware of which actually covers this in its entirety is http://blackjml.livejournal.com/23029.html. I think _maybe_ the trial tutorial actually covers all the steps but I'll need to read it very carefully to see if I can spot anything that it doesn't address :).
The tests only exercise functionality against one reactor at a time which leads to additional challenges for buildbot.
AMEN. If we made all tests ReactorBuilder tests, would we be able to kill all the alternate-reactor (i.e. -r whatever) buildbots? It seems like that would speed up build results quite a bit.
If a test is for reactor functionality with multiple implementations, the ReactorBuilder-style tests are better. If a test is for implementation details of a particular reactor, I think the necessary parts of that reactor should just be invoked directly - on a new instance. For anything that's not a test for reactor functionality, no reactor should be involved at all (protocol implementations, etc).
As I said above, we have a few gaps in this area which we need to work on filling. For example, I recently wrote some tests where I wanted an IReactorCore provider but was dismayed to discover that Twisted doesn't provide an in-memory implementor of that interface anywhere, despite the fact that all I needed to call was addSystemEventTrigger/fireSystemEventTrigger, two APIs which just manipulate lists in memory even in their "real" implementations. I was bad open-source citizen and did not immediately file a bug! However, writing this email made me do so http://twistedmatrix.com/trac/ticket/5353. While the code that I was writing wasn't for Twisted itself, I do feel like this may be an area where the prevalence of bad examples comes from the fact that we don't give our users a lot of options for nice, isolated testing. It's definitely easier to just use the real reactor, despite its myriad issues, than to spend a week implementing comprehensive, testable versions of the ~40-odd interfaces in twisted.internet.interfaces. But that means that all tx* projects tend to test things in the worse style, which means that most new patch contributors are likely to be cribbing from bad examples when writing a patch for Twisted. So I filed another ticket for another frequent sticking point for me: http://twistedmatrix.com/trac/ticket/5354. Hopefully if we can make more such things popular it will more widely disseminate the skills necessary to write tests in a more self-contained style.
I've mentioned these ideas to various people at various times, but they don't seem to be catching on, so I'd like to hear why and come to some conclusion about how we're going to write tests in the future.
The new Trial tutorial provides a great resource that we were lacking for a long time, so we should make reference to its section on using StringTransport and Clock frequently and with enthusiasm. We a similar document for ReactorBuilder though. I think we may want to also write up a companion wiki page that is more the rhetorical rather than pedagogic side of this problem, collecting our explanations of why you really want to use in-memory stuff instead of "real" reactors. New contributors sometimes say that they don't feel like the code is "really" being tested unless they're testing against the real implementation of something, and it may not be immediately obvious that testing against a not-real implementation is desirable in many cases. Much of it can just be copy/pasted from this thread :).
On 11/01/2011 11:49 PM, Glyph wrote:
of "real" reactors. New contributors sometimes say that they don't feel like the code is "really" being tested unless they're testing against the real implementation of something, and it may not be immediately obvious that testing against a not-real implementation is desirable in many cases. Much of it can just be copy/pasted from this thread :).
Without commenting on everything else in the thread - this realisation was something I came to late, and I feel it would have helped enormously if it had come earlier. So, +1 on convincing people to go for in-memory / simulated transport/clock/etc. But I'll note that most people I've seen crib from existing test code during the early phases of TDD, tutorials or not - and if most test code in Twisted uses the real reactor...
On Nov 2, 2011, at 3:55 AM, Phil Mayers wrote:
But I'll note that most people I've seen crib from existing test code during the early phases of TDD, tutorials or not - and if most test code in Twisted uses the real reactor...
Thanks for volunteering to fix some of our existing tests, then! There's been a real shortage of labor for that ;-).
On 11/02/2011 10:56 PM, Glyph Lefkowitz wrote:
On Nov 2, 2011, at 3:55 AM, Phil Mayers wrote:
But I'll note that most people I've seen crib from existing test code during the early phases of TDD, tutorials or not - and if most test code in Twisted uses the real reactor...
Thanks for volunteering to fix some of our existing tests, then! There's been a real shortage of labor for that ;-).
Is there a ticket/list/trac report for "make this test be like X"?
On Nov 3, 2011, at 5:14 AM, Phil Mayers wrote:
On 11/02/2011 10:56 PM, Glyph Lefkowitz wrote:
On Nov 2, 2011, at 3:55 AM, Phil Mayers wrote:
But I'll note that most people I've seen crib from existing test code during the early phases of TDD, tutorials or not - and if most test code in Twisted uses the real reactor...
Thanks for volunteering to fix some of our existing tests, then! There's been a real shortage of labor for that ;-).
Is there a ticket/list/trac report for "make this test be like X"?
No, unfortunately. A big part of the work here is finding and cataloguing the work that needs to be done.
participants (12)
-
Alexey Ganzha
-
Anton Gyllenberg
-
Christopher Armstrong
-
Corbin Simpson
-
exarkun@twistedmatrix.com
-
Glyph
-
Glyph Lefkowitz
-
Itamar Turner-Trauring
-
Jonathan Lange
-
Laurens Van Houtven
-
Phil Mayers
-
Tim Allen