Twisted Isn't Specific (was Re: Trial balloon: microthreads library in stdlib)

On 02:20 am, greg.ewing@canterbury.ac.nz wrote:
If Twisted is designed so that it absolutely *has* to use its own special event mechanism, and everything else needs to be modified to suit its requirements, then it's part of the problem, not part of the solution.
I've often heard this complaint, usually of the form "that's twisted-specific". The thing is, Twisted isn't specific. Its event mechanism isn't "special". In fact it's hard to imagine how it might be made less "special" than it currently is.
Whatever your event dispatch mechanism, *some* code has to be responsible for calling the OS API of select(), WaitForMultipleObjects(), g_main_loop_run(), or whatever. Twisted actually imposes very few requirements for code to participate in this, and was designed from day one specifically to be a generalized mainloop mechanism which would not limit you to one underlying multiplexing implementation, event-dispatch mechanism, or operating system if you used its API.
There have even been a few hacks to integrate Twisted with the asyncore mainloop, but almost everyone who knows both asyncore and Twisted has rapidly decided to just port all of their code to Twisted rather than maintain that bridge.
In fact, Twisted includes fairly robust support for threads, so if you really need it you can mix and match event-driven and blocking styles. Again, most people who try this find that it is just nicer to write straight to the Twisted API, but for those that really need it, such as Zope and other WSGI-contained applications, it is available.
Aside from the perennial issue of restartable reactors (which could be resolved as part of a stdlib push), Twisted's event loop imposes very few constraints on your code. It provides a lot of tools, sure, but few of them are required. You don't even *need* to use Deferreds.
Now, "Twisted", overall, can be daunting. It has a lot of conventions, a lot of different interfaces to memorize and deal with, but if you are using the main loop you don't necessarily have to care about our SSH, ECMA 48, NNTP, OSCAR or WebDAV implementation. Those are all built at a higher level.
It may seem like I'm belaboring the point here, but every time a thread comes up on this list mentioning the possibility of a "standard" event mechanism, people who do not know Twisted very well or haven't used it start in with implied FUD that Twisted is a big, complicated, inseparable hairy mess that has no place in the everyday Python programmer's life. It's tangled-up Deep Magic, only for Wizards Of The Highest Order. Alternatively, more specific to this method, it's highly specific and intricate and very tightly coupled.
Nothing could be further from the truth. It is *strictly* layered to prevent pollution of the lower levels by higher level code, and all the dependencies are one-way.
Maybe our introductory documentation isn't very good. Maybe event-driven programming is confusing for those expecting something else. Maybe the cute names of some of the modules are off-putting. Still, all that aside, if you are looking for an event-driven networking engine, Twisted is about as straightforward as you can get without slavishly coding to one specific platform API.
When you boil it down, Twisted's event loop is just a notification for "a connection was made", "some data was received on a connection", "a connection was closed", and a few APIs to listen or initiate different kinds of connections, start timed calls, and communicate with threads. All of the platform details of how data is delivered to the connections are abstracted away. How do you propose we would make a less "specific" event mechanism?
I strongly believe that there is room for a portion of the Twisted reactor API to be standardized in the stdlib so that people can write simple event-driven code "out of the box" with Python, but still have the different plug-in implementations live in Twisted itself.
The main thing blocking this on my end (why I am not writing PEPs, advocating for it more actively, etc.) is that it is an extremely low priority, and other, higher level pieces of Twisted have more pressing issues (such as the current confusion in the "web" universe). Put simply, although it might be nice, nobody really *needs* it in the stdlib, so they're not going to spend the effort to get it there.
If someone out there really had a need for an event mechanism in the standard library, though, I encourage them to look long and hard at how the existing interfaces in Twisted could be promoted to the standard library and continue to be maintained compatibly in both places. At the very least, standardizing on something very much like IProtocol would go a long way towards making it possible to write async clients and servers that could run out of the box in the stdlib as well as with Twisted, even if the specific hookup mechanism (listenTCP, listenSSL, et al.) were incompatible - although a signature-compatible callLater would probably be a must.
As I said, I don't have time to write the PEPs myself, but I might fix some specific bugs if there were a clear set of issues preventing this from moving forward. Better integration with the standard library would definitely be a big win for both Twisted and Python.
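To make the shape of that proposal concrete, here is a minimal sketch of a protocol object that reacts only to the three notifications described above, driven by a toy scheduler with a callLater-style call. Everything in it is an assumption made for illustration: EchoProtocol, MemoryTransport, ToyLoop and the snake_case method names are hypothetical and are not Twisted's API (Twisted's own IProtocol uses names such as connectionMade, dataReceived and connectionLost), and the loop uses a virtual clock instead of real sockets or timers.

```python
import heapq
import itertools


class EchoProtocol:
    """Hypothetical protocol object: it only reacts to three notifications."""

    def __init__(self):
        self.events = []
        self.transport = None

    def connection_made(self, transport):
        self.transport = transport
        self.events.append("made")

    def data_received(self, data):
        self.events.append(("data", data))
        self.transport.write(data)  # echo the bytes back

    def connection_lost(self, reason):
        self.events.append(("lost", reason))


class MemoryTransport:
    """Stand-in for a real connection: records what the protocol writes."""

    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)


class ToyLoop:
    """Toy scheduler with a callLater-style API and a virtual clock."""

    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._order = itertools.count()  # tie-breaker for equal deadlines

    def call_later(self, delay, func, *args):
        entry = (self.now + delay, next(self._order), func, args)
        heapq.heappush(self._queue, entry)

    def run(self):
        while self._queue:
            when, _, func, args = heapq.heappop(self._queue)
            self.now = when  # jump the virtual clock; no real sleeping
            func(*args)


loop = ToyLoop()
proto = EchoProtocol()
transport = MemoryTransport()

# Simulate the lifetime of one connection as three scheduled notifications.
loop.call_later(0.0, proto.connection_made, transport)
loop.call_later(0.1, proto.data_received, b"hello")
loop.call_later(0.2, proto.connection_lost, "closed cleanly")
loop.run()

print(proto.events)
print(transport.written)
```

The protocol class never touches select() or any other platform call, which is the point being argued: the same object could be handed to any loop that delivers these three notifications.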

On Wed, Feb 14, 2007, glyph@divmod.com wrote:
As I said, I don't have time to write the PEPs myself, but I might fix some specific bugs if there were a clear set of issues preventing this from moving forward. Better integration with the standard library would definitely be a big win for both Twisted and Python.
Here's where I'm coming from:
My first experience with Twisted was excellent: I needed a web server in fifteen minutes to do my PyCon presentation, and it Just Worked.
My second experience with Twisted? Well, I didn't really have one. My first experience was Twisted 1.1, and when I tried installing 2.0 on my Mac (about 1.5 years ago), it just didn't work. Combined with the difficulty of using the documentation and the fact that I was in a hurry, I rejected the Twisted solution. (My company needed an FTP server that did a callback every time a file got uploaded -- something that I expect would be very simple for Twisted.)
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/
"I disrespectfully agree." --SJM

glyph@divmod.com wrote:
When you boil it down, Twisted's event loop is just a notification for "a connection was made", "some data was received on a connection", "a connection was closed", and a few APIs to listen or initiate different kinds of connections, start timed calls, and communicate with threads. All of the platform details of how data is delivered to the connections are abstracted away. How do you propose we would make a less "specific" event mechanism?
But that is exactly the problem I have with Twisted. For HTTP it creates its own set of notifications instead of structuring the code similarly to SocketServer (UDP and TCP), BaseHTTPServer, SimpleHTTPServer etc., which are well understood in the Python community and e.g. used by medusa and asyncore. Having to completely restructure one's own code is a high price.
Giving control away into a big framework that calls my own code for not so easy to understand reasons (for a twisted noob) does not give me a warm feeling. It's o.k. for complex applications like web servers but for small networking applications I'd like to have a chance to understand what's going on. Asyncore is so simple that it's easy to follow when I let it do the select() for me.
That said, I conclude that the protocol implementations are superb but unfortunately too tightly coupled to the Twisted philosophy, sitting in the middle, trying to orchestrate instead of being easy to integrate.
Joachim

glyph@divmod.com wrote:
On 02:20 am, greg.ewing@canterbury.ac.nz wrote:
If Twisted is designed so that it absolutely *has* to use its own special event mechanism, and everything else needs to be modified to suit its requirements, then it's part of the problem, not part of the solution.
I've often heard this complaint, usually of the form "that's twisted-specific". The thing is, Twisted isn't specific. Its event mechanism isn't "special". In fact it's hard to imagine how it might be made less "special" than it currently is.
Whatever your event dispatch mechanism, *some* code has to be responsible for calling the OS API of select(), WaitForMultipleObjects(), g_main_loop_run(), or whatever. Twisted actually imposes very few requirements for code to participate in this, and was designed from day one specifically to be a generalized mainloop mechanism which would not limit you to one underlying multiplexing implementation, event-dispatch mechanism, or operating system if you used its API.
When I last looked at twisted (that is several years ago), there were several reactors - win32reactor, wxreactor, maybe even more. And they didn't even work too well. The problems I remember were that the win32reactor was limited to only a handful of handles, the wxreactor didn't process events when a wx modal dialog box was displayed, and so on. Has this changed?
Thanks, Thomas

I was the one on the Stackless list who last September or so proposed the idea of monkeypatching and I'm including that idea in my presentation for PyCon. See my early rough draft at http://www.stackless.com/pipermail/stackless/2007-February/002212.html which contains many details about using Stackless, though none on the Stackless implementation. (A lot on how to tie things together.)
So people know, I am an applications programmer and not a systems programmer. Things like OS-specific event mechanisms annoy and frustrate me. If I could do away with hardware and still write useful programs I would.
I have tried 3 times to learn Twisted. The first time I found and reported various problems and successes. See emails at http://www.twistedmatrix.com/pipermail/twisted-python/2003-June/thread.html The second time was to investigate a way to report upload progress: http://twistedmatrix.com/trac/ticket/288 and the third was to compare Allegra and Twisted http://www.dalkescientific.com/writings/diary/archive/2006/08/28/levels_of_a...
In all three cases I've found it hard to use Twisted because the code didn't do as I expected it to do and when something went wrong I got results which were hard to interpret. I believe others have similar problems, and that this is one reason Twisted is considered to be "a big, complicated, inseparable hairy mess."
I find the Stackless code also hard to understand. Eg, I don't know where the watchdog code is for the "run()" command. It uses several layers of macros and I haven't been able to get it straight in my head. However, so far I've not run into strange errors in Stackless that I have in Twisted.
I find the normal Python code relatively easy to understand.
Stackless only provides threadlets. It does no I/O. Richard Tew developed a "stacklesssocket" module which emulates the API for the stdlib "socket" module.
I tweaked it a bit and showed that by doing the monkeypatch
    import stacklesssocket
    import sys
    sys.modules["socket"] = stacklesssocket
then code like "urllib.urlopen" became Stackless compatible. Eg, in my PyCon talk draft I show something like
    import slib
    # must monkeypatch before any other module imports "socket"
    slib.use_monkeypatch()
    import urllib2
    import time
    import hashlib
    def fetch_and_reverse(host):
        t1 = time.time()
        s = urllib2.urlopen("http://"+host+"/").read()[::-1]
        dt = time.time() - t1
        digest = hashlib.md5(s).hexdigest()
        print "hash of %r/ = %s in %.2f s" % (host, digest, dt)
    slib.main_tasklet(fetch_and_reverse)("www.python.org")
    slib.main_tasklet(fetch_and_reverse)("docs.python.org")
    slib.main_tasklet(fetch_and_reverse)("planet.python.org")
    slib.run_all()
where the three fetches occur in parallel.
The choice of asyncore is, I think, done because 1) it prevents needing an external dependency, 2) asyncore is smaller and easier to understand than Twisted, and 3) it was for demo/proof of concept purposes. While tempting to improve that module I know that Twisted has already gone through all the platform-specific crap and I don't want to go through it again myself. I don't want to write a reactor to deal with GTK, and one for OS X, and one for ...
Another reason I think Twisted is considered "tangled-up Deep Magic, only for Wizards Of The Highest Order" is because it's infused with event-based processing. I've done a lot of SAX processing and I can say that few people think that way or want to go through the process of learning how.
Compare, for example, the following
    f = urllib2.urlopen("http://example.com/")
    for i, line in enumerate(f):
        print ("%06d" % i), repr(line)
with the normal equivalent in Twisted or other async-based system.
Yet by using the Stackless socket monkeypatch, this same code works in an async framework. And the underlying libraries have a much larger developer base than Twisted. Want NNTP? "import nntplib" Want POP3? "import poplib" Plenty of documentation about them too.
On the Stackless mailing list I have proposed someone work on a talk for EuroPython titled "Stackless and Twisted". Andrew Francis has been looking into how to do that.
All the earlier quotes were lifted from glyph. Here's another:
When you boil it down, Twisted's event loop is just a notification for "a connection was made", "some data was received on a connection", "a connection was closed", and a few APIs to listen or initiate different kinds of connections, start timed calls, and communicate with threads. All of the platform details of how data is delivered to the connections are abstracted away. How do you propose we would make a less "specific" event mechanism?
What would I need to do to extract this Twisted core so I could replace asyncore? I know at minimum I need "twisted.internet" and "twisted.python" (the latter for logging) and "twisted.persisted" for "styles.Ephemeral".
But I say this hesitantly recalling the frustrations I had in dealing with a connection error in Twisted, described in the aforementioned link http://www.dalkescientific.com/writings/diary/archive/2006/08/28/levels_of_a...
I feel that using the phrase "just a" in the previously quoted text is an understatement. While the mechanics might be simple, there are many, many layers, as you can see in this stack trace.
  File "async_blast.py", line 55, in ?
    reactor.run()
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/posixbase.py", line 218, in run
    self.mainLoop()
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/posixbase.py", line 229, in mainLoop
    self.doIteration(t)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/selectreactor.py", line 133, in doSelect
    _logrun(selectable, _drdw, selectable, method, dict)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/python/log.py", line 53, in callWithLogger
    return callWithContext({"system": lp}, func, *args, **kw)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/python/log.py", line 38, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/python/context.py", line 59, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/python/context.py", line 37, in callWithContext
    return func(*args,**kw)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/selectreactor.py", line 139, in _doReadOrWrite
    why = getattr(selectable, method)()
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/tcp.py", line 535, in doConnect
    self.failIfNotConnected(error.getConnectError((connectResult, os.strerror(connectResult))))
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/error.py", line 160, in getConnectError
    return klass(number, string)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/error.py", line 105, in __init__
    traceback.print_stack()
That feels like 6 layers too many, given that
    _logrun(selectable, _drdw, selectable, method, dict)
    return context.call({ILogContext: newCtx}, func, *args, **kw)
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
    return func(*args, **kw)
    getattr(selectable, method)()
    klass(number, string)
are all generic calls.
(Note that I argued against the twisted.internet.error way of doing things as it changed my error number on me and gave me a non-system-standard, non-i18n error message.)
I do not think Twisted can be changed to be an async kernel of the sort I would like without making enough changes as to be incompatible with the existing code.
Also, and I say this to stress the difficulties of an outsider in using Twisted, I don't understand what's meant by "IProtocol" in
At the very least, standardizing on something very much like IProtocol would go a long way towards making it possible to write async clients and servers
There are 37 pages (according to Google) in the twistedmatrix domain which talk about IProtocol and are not "API docs" or part of a ticket.
    IProtocol site:twistedmatrix.com -"API docs" -"twisted-commits"
None provided insight. The API doc is at http://twistedmatrix.com/documents/current/api/twisted.internet.interfaces.I... but I don't know how to use it or even why it would work. How would I add that to an asyncore-based library? What would I need to support the adaption?
There's a very high barrier to entry and while I know there are end rewards like support across many platforms I also know that I only really need to support server-side Mac and Linux boxes, and no GUIs, so asyncore may be good enough for my own work.
Andrew
dalke@dalkescientific.com

On Thu, 15 Feb 2007 02:36:22 -0700, Andrew Dalke <dalke@dalkescientific.com> wrote:
I was the one on the Stackless list who last September or so proposed the idea of monkeypatching and I'm including that idea in my presentation for PyCon. See my early rough draft at http://www.stackless.com/pipermail/stackless/2007-February/002212.html which contains many details about using Stackless, though none on the Stackless implementation. (A lot on how to tie things together.)
So people know, I am an applications programmer and not a systems programmer. Things like OS-specific event mechanisms annoy and frustrate me. If I could do away with hardware and still write useful programs I would.
What a wonderful world it would be. :)
[snip]
In all three cases I've found it hard to use Twisted because the code didn't do as I expected it to do and when something went wrong I got results which were hard to interpret. I believe others have similar problems, and that this is one reason Twisted is considered to be "a big, complicated, inseparable hairy mess."
I find the Stackless code also hard to understand. Eg, I don't know where the watchdog code is for the "run()" command. It uses several layers of macros and I haven't been able to get it straight in my head. However, so far I've not run into strange errors in Stackless that I have in Twisted.
As you point out below, however, Twisted and stackless achieve different goals.
I find the normal Python code relatively easy to understand.
Stackless only provides threadlets. It does no I/O. Richard Tew developed a "stacklesssocket" module which emulates the API for the stdlib "socket" module. I tweaked it a bit and showed that by doing the monkeypatch
    import stacklesssocket
    import sys
    sys.modules["socket"] = stacklesssocket
then code like "urllib.urlopen" became Stackless compatible. Eg, in my PyCon talk draft I show something like
It may be of interest to you to learn that a Twisted developer implemented this model several years ago. It has not been further developed for a handful of reasons, at the core of which is the fact that it is very similar to pre-emptive threading in terms of application-level complexity.
You gave several examples of the use of existing code which expects a blocking socket interface and which "just works" when the socket module is changed in this way. However, this is a slight simplification. Code written without expecting a context switch (exactly what happens when a socket operation is performed in this model) is not necessarily correct when context switches are suddenly introduced. Consider this extremely trivial example:
    x = 0
    def foo(conn):
        global x
        a = x + 1
        b = ord(conn.recv(1))
        x = a + b
        return x
Clearly, foo is not threadsafe. Global mutable state is a terrible, terrible thing. The point to note is that by introducing a context switch at the conn.recv(1) call, the same effect is achieved as by any other context switch: it becomes possible for foo to return an inconsistent result or otherwise corrupt its own state if another piece of code violates its assumptions and changes x while it is waiting for the recv call to complete.
Is urllib2 threadsafe? I have heard complaints that it is not. I have looked at the code, and at least in its support for caching, it appears not to be. Perhaps it can be made threadsafe, but in requiring that, the advantage of having a whole suite of modules which will "just work" with a transparently context switching socket module is mostly lost.
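The interleaving described for foo can be reproduced deterministically without Stackless or real sockets. The sketch below is an illustration under stated assumptions: it is written for Python 3, it replaces conn.recv(1) with a generator yield to stand in for the hidden context switch, and the finish helper and the "conn-1"/"conn-2" placeholders are hypothetical names, not part of any library.

```python
x = 0


def foo(conn):
    """Same logic as the foo example; `yield` marks where recv would switch."""
    global x
    a = x + 1            # read shared state
    byte = yield conn    # context switch: another task may run here
    b = byte[0]          # Python 3 equivalent of ord(conn.recv(1))
    x = a + b            # write back using a possibly stale value of x
    return x


def finish(task, data):
    """Resume a suspended task with the received data and return its result."""
    try:
        task.send(data)
    except StopIteration as stop:
        return stop.value
    raise RuntimeError("task yielded again unexpectedly")


# Two tasks start before either one receives its byte.
first, second = foo("conn-1"), foo("conn-2")
next(first)    # first reads x == 0, then waits
next(second)   # second also reads x == 0, then waits

r1 = finish(first, b"\x01")   # x = 1 + 1 = 2
r2 = finish(second, b"\x01")  # x = 1 + 1 = 2, first's update is lost
print(r1, r2, x)
```

Run one after the other, the two calls would have left x at 4; interleaved at the receive point, both return 2 and one update disappears, which is the same hazard as with pre-emptive threads.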
[snip - urllib2/tasklet example]
The choice of asyncore is, I think, done because 1) it prevents needing an external dependency,
But if some new event loop is introduced into the standard library, then using it also will not require an external dependency. ;)
2) asyncore is smaller and easier to understand than Twisted,
While I hear this a lot, applications written with Twisted _are_ shorter and contain less irrelevant noise in the form of boilerplate than the equivalent asyncore programs. This may not mean that Twisted programs are easier to understand, but it is at least an objectively measurable metric.
and 3) it was for demo/proof of concept purposes. While tempting to improve that module I know that Twisted has already gone through all the platform-specific crap and I don't want to go through it again myself. I don't want to write a reactor to deal with GTK, and one for OS X, and one for ...
Now if we can only figure out a way for everyone to benefit from this without tying too many brains up in knots. :)
Another reason I think Twisted is considered "tangled-up Deep Magic, only for Wizards Of The Highest Order" is because it's infused with event-based processing. I've done a lot of SAX processing and I can say that few people think that way or want to go through the process of learning how.
Compare, for example, the following
    f = urllib2.urlopen("http://example.com/")
    for i, line in enumerate(f):
        print ("%06d" % i), repr(line)
with the normal equivalent in Twisted or other async-based system.
Several years ago, Christopher Armstrong (hopefully he won't get too upset at me for mentioning him here) wrote a Twisted/Stackless integration library. When greenlets came out, he wrote a similar library for integrating Twisted with those. He also wrote a utility generally referred to as "defgen", and James Knight updated it to take advantage of the Python 2.5 changes to generators. Through all of that, however, Twisted is still taking care of all of the nitty gritty platform details.
Whether one uses stackless or greenlets or generators or any other mechanism, it is important to realize that the lexical structure of the code is not inherently tied to the networking library in use. If you want to write code in the style of the above for loop, you can do so with Twisted and stackless. The problems involved are much the same as those involving urllib2/stackless, but at least Twisted is prepared to deal with context switching around network events, so any bugs you encounter are likely to be due to mistaken assumptions in your own application code. :)
For what it's worth, I prefer context switches to be explicit in the style of continuation passing so that I am less likely to introduce such bugs into my own code. This is, however, entirely at my discretion, and I am not about to force anyone else to develop their applications this way.
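As a rough illustration of how generators let sequential-looking code sit on top of callbacks, here is a minimal Python 3 sketch. It is not the defgen implementation and does not use Twisted: Result is a hypothetical stand-in for a callback-carrying object, drive is a hypothetical trampoline, and fetch_lines is an invented example task. Each yield is an explicit, visible switch point.

```python
class Result:
    """Hypothetical stand-in for a deferred result: fires its callbacks once."""

    def __init__(self):
        self._callbacks = []
        self._fired = False
        self._value = None

    def add_callback(self, func):
        if self._fired:
            func(self._value)        # already available: call straight away
        else:
            self._callbacks.append(func)

    def fire(self, value):
        self._fired = True
        self._value = value
        callbacks, self._callbacks = self._callbacks, []
        for func in callbacks:
            func(value)


def drive(gen):
    """Run a generator that yields Result objects, resuming it as each fires."""
    done = Result()

    def step(value):
        try:
            pending = gen.send(value)   # run until the next explicit yield
        except StopIteration as stop:
            done.fire(stop.value)       # the generator returned: publish it
            return
        pending.add_callback(step)      # resume later, when the data arrives

    step(None)
    return done


def fetch_lines(source):
    """Reads like blocking code, but the switch point is the visible yield."""
    data = yield source
    return data.decode("ascii").splitlines()


pending_read = Result()
finished = drive(fetch_lines(pending_read))

collected = []
finished.add_callback(collected.append)

# Nothing has run past the yield yet; the event loop would fire this later.
pending_read.fire(b"first\nsecond\n")
print(collected)
```

The body of fetch_lines has the same lexical shape as the blocking for loop, while every point at which other code may run is marked by a yield, which is the explicit style preferred in the paragraph above.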
Yet by using the Stackless socket monkeypatch, this same code works in an async framework. And the underlying libraries have a much larger developer base than Twisted. Want NNTP? "import nntplib" Want POP3? "import poplib" Plenty of documentation about them too.
This is going to come out pretty harshly, for which I can only apologize in advance, but it bears mention. The quality of protocol implementations in the standard library is bad. As in "not good". Twisted's NNTP support is better (even if I do say so myself - despite only having been worked on by me, when I knew almost nothing about Twisted, and having essentially never been touched since). Twisted's POP3 support is fantastically awesome. Next to imaplib, twisted.mail.imap4 is a sparkling diamond. And each of these implements the server end of the protocol as well: you won't find that in the standard library for almost any protocol.
As for the documentation, please compare these two pages:
http://python.org/doc/lib/pop3-objects.html
http://twistedmatrix.com/documents/current/api/twisted.mail.pop3.AdvancedPOP...
I think it is fair to call them comparable. They could both stand some improvement, really. :) And if someone wants to argue that, if the POP3 client from Twisted is going to be added to the standard library, its documentation should be improved first, I'm not going to argue against that. Docs are great, more docs are greater. But let's bear in mind that at present, no one has suggested adding anything but the core Twisted event loop.
All the earlier quotes were lifted from glyph. Here's another:
When you boil it down, Twisted's event loop is just a notification for "a connection was made", "some data was received on a connection", "a connection was closed", and a few APIs to listen or initiate different kinds of connections, start timed calls, and communicate with threads. All of the platform details of how data is delivered to the connections are abstracted away. How do you propose we would make a less "specific" event mechanism?
What would I need to do to extract this Twisted core so I could replace asyncore? I know at minimum I need "twisted.internet" and "twisted.python" (the latter for logging) and "twisted.persisted" for "styles.Ephemeral".
Neither of those dependencies is a very hard one. I suspect that there would be resistance to adding a new logging system to the standard library, just for Twisted. You have the right idea though. Some portion of twisted.internet, and whatever utility code it depends on.
But I say this hesitantly recalling the frustrations I had in dealing with a connection error in Twisted, described in the aforementioned link http://www.dalkescientific.com/writings/diary/archive/2006/08/28/levels_of_a...
I feel that using the phrase "just a" in the previously quoted text is an understatement.
I think you're right. We throw around "just" a lot in our line of work, don't we? :) Twisted does also account for a raft of platform-specific quirks and inconsistencies. I take this to be a good thing.
While the mechanics might be simple, there are many, many layers, as you can see in this stack trace.
  File "async_blast.py", line 55, in ?
    reactor.run()
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/posixbase.py", line 218, in run
    self.mainLoop()
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/posixbase.py", line 229, in mainLoop
    self.doIteration(t)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/selectreactor.py", line 133, in doSelect
    _logrun(selectable, _drdw, selectable, method, dict)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/python/log.py", line 53, in callWithLogger
    return callWithContext({"system": lp}, func, *args, **kw)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/python/log.py", line 38, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/python/context.py", line 59, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/python/context.py", line 37, in callWithContext
    return func(*args,**kw)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/selectreactor.py", line 139, in _doReadOrWrite
    why = getattr(selectable, method)()
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/tcp.py", line 535, in doConnect
    self.failIfNotConnected(error.getConnectError((connectResult, os.strerror(connectResult))))
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/error.py", line 160, in getConnectError
    return klass(number, string)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/error.py", line 105, in __init__
    traceback.print_stack()
That feels like 6 layers too many, given that
    _logrun(selectable, _drdw, selectable, method, dict)
    return context.call({ILogContext: newCtx}, func, *args, **kw)
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
    return func(*args, **kw)
    getattr(selectable, method)()
    klass(number, string)
are all generic calls.
I know function calls are expensive in Python, and method calls even more so... but I still don't understand this issue. Twisted's call stack is too deep? It is fair to say it is deep, I guess, but I don't see how that is a problem. If it is, I don't see how it is specific to this discussion.
(Note that I argued against the twisted.internet.error way of doing things as it changed my error number on me and gave me a non-system-standard, non-i18n error message.)
Note that we ended up /not/ changing the error number in the case you encountered. We changed the connection setup code to handle the unexpected behavior on OS X. :) Twisted is faithfully reporting the same errno as the underlying platform is producing. Since most applications don't know or care about such things though, it is also putting it into an exception instance which indicates the category into which the error falls. These all seem like good things to me.
I do not think Twisted can be changed to be an async kernel of the sort I would like without making enough changes as to be incompatible with the existing code.
Also, and I say this to stress the difficulties of an outsider in using Twisted, I don't understand what's meant by "IProtocol" in
At the very least, standardizing on something very much like IProtocol would go a long way towards making it possible to write async clients and servers
It is exactly these interfaces which make it possible to make changes to Twisted without breaking things. The behaviour of the APIs exposed by Twisted is defined. Even if IProtocol is not adopted verbatim, the existence of IProtocol and another interface means one can be adapted to the other in some manner, providing compatibility for existing applications.
There are 37 pages (according to Google) in the twistedmatrix domain which talk about IProtocol and are not "API docs" or part of a ticket.
IProtocol site:twistedmatrix.com -"API docs" -"twisted-commits"
None provided insight. The API doc is at http://twistedmatrix.com/documents/current/api/twisted.internet.interfaces.I...
Since the standard library lacks interfaces, it may be the case that IProtocol is instead adopted as an ABC or even that it won't appear in code at all, but instead be translated into non-source documentation. I'd prefer if z.i were in the stdlib, but that's a separate issue. What Glyph is saying when he talks about standardizing IProtocol is the standardization of an API, nothing more. Which is what this whole thread is about, if I am not mistaken. I apologize for writing such a long message, but I didn't have time to write a shorter one. Jean-Paul
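To make the "adopted as an ABC" option concrete, here is a minimal sketch of what such a class could look like. The four method names mirror those of Twisted's IProtocol; the class names `ProtocolABC`, `Echo` and `ListTransport` are hypothetical and exist only for this illustration. Nothing here is a proposal from the thread itself.

```python
from abc import ABC, abstractmethod


class ProtocolABC(ABC):
    """Hypothetical ABC borrowing the method names of Twisted's IProtocol."""

    @abstractmethod
    def makeConnection(self, transport):
        """Associate this protocol with a transport."""

    @abstractmethod
    def connectionMade(self):
        """Called once the connection is ready for use."""

    @abstractmethod
    def dataReceived(self, data):
        """Called whenever a chunk of bytes arrives."""

    @abstractmethod
    def connectionLost(self, reason):
        """Called when the connection goes away."""


class Echo(ProtocolABC):
    """Writes every received chunk straight back to the transport."""

    def makeConnection(self, transport):
        self.transport = transport
        self.connectionMade()

    def connectionMade(self):
        self.closed = False

    def dataReceived(self, data):
        self.transport.write(data)

    def connectionLost(self, reason):
        self.closed = True


class ListTransport:
    """Hypothetical in-memory transport, for illustration only."""

    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)
```

Any event loop that knows how to call these four methods can drive `Echo`, which is the sense in which only an API is being standardized.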

On Thu, Feb 15, 2007 at 09:19:30AM -0500, Jean-Paul Calderone wrote:
That feels like 6 layers too many, given that
    _logrun(selectable, _drdw, selectable, method, dict)
    return context.call({ILogContext: newCtx}, func, *args, **kw)
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
    return func(*args, **kw)
    getattr(selectable, method)()
    klass(number, string)
are all generic calls.
I know function calls are expensive in Python, and method calls even more so... but I still don't understand this issue. Twisted's call stack is too deep? It is fair to say it is deep, I guess, but I don't see how that is a problem. If it is, I don't see how it is specific to this discussion.
It's hard to debug the resulting problem. Which level of the *12* levels in the stack trace is responsible for a bug? Which of the *6* generic calls is calling the wrong thing because a handler was set up incorrectly or the wrong object provided? The code is so 'meta' that it becomes effectively undebuggable. --amk

On Thu, 15 Feb 2007 10:46:05 -0500, "A.M. Kuchling" <amk@amk.ca> wrote:
On Thu, Feb 15, 2007 at 09:19:30AM -0500, Jean-Paul Calderone wrote:
That feels like 6 layers too many, given that
    _logrun(selectable, _drdw, selectable, method, dict)
    return context.call({ILogContext: newCtx}, func, *args, **kw)
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
    return func(*args, **kw)
    getattr(selectable, method)()
    klass(number, string)
are all generic calls.
I know function calls are expensive in Python, and method calls even more so... but I still don't understand this issue. Twisted's call stack is too deep? It is fair to say it is deep, I guess, but I don't see how that is a problem. If it is, I don't see how it is specific to this discussion.
It's hard to debug the resulting problem. Which level of the *12* levels in the stack trace is responsible for a bug? Which of the *6* generic calls is calling the wrong thing because a handler was set up incorrectly or the wrong object provided? The code is so 'meta' that it becomes effectively undebuggable.
I've debugged plenty of Twisted applications. So it's not undebuggable. :) Application code tends to reside at the bottom of the call stack, so Python's traceback order puts it right where you're looking, which makes it easy to find. For any bug which causes something to be set up incorrectly and only later manifests as a traceback, I would posit that whether there is 1 frame or 12, you aren't going to get anything useful out of the traceback. Standard practice here is just to make exception text informative, I think, but this is another general problem with Python programs and event loops, not one specific to either Twisted itself or the particular APIs Twisted exposes. As a personal anecdote, I've never once had to chase a bug through any of the 6 "generic calls" singled out. I can't think of a case where I've helped anyone else who had to do this, either. That part of Twisted is very old, it is _very_ close to bug-free, and application code doesn't have very much control over it at all. Perhaps in order to avoid scaring people, there should be a way to elide frames from a traceback (I don't much like this myself, I worry about it going wrong and chopping out too much information, but I have heard other people ask for it)? Jean-Paul
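One way the "elide frames from a traceback" idea could be sketched with the standard `traceback` module is shown below. The function name `format_elided` and the filename-substring matching rule are assumptions made for this illustration; this is not a Twisted or stdlib feature. It also reports how many frames were dropped, which addresses part of the worry about chopping out too much silently.

```python
import traceback


def format_elided(exc, hidden_fragments):
    """Format exc's traceback, dropping frames whose filename contains
    any of the given substrings (hypothetical matching rule)."""
    frames = traceback.extract_tb(exc.__traceback__)
    kept = [
        frame for frame in frames
        if not any(fragment in frame.filename for fragment in hidden_fragments)
    ]
    elided = len(frames) - len(kept)
    lines = [
        "Traceback (most recent call last, %d frame(s) elided):\n" % elided
    ]
    lines += traceback.format_list(kept)
    lines += traceback.format_exception_only(type(exc), exc)
    return "".join(lines)
```

A caller would catch the exception and pass something like `["twisted/python/", "twisted/internet/selectreactor"]` as `hidden_fragments`; those particular fragments are only an example of what a user might choose to hide.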

On Thu, 15 Feb 2007 10:46:05 -0500, "A.M. Kuchling" <amk@amk.ca> wrote:
It's hard to debug the resulting problem. Which level of the *12* levels in the stack trace is responsible for a bug? Which of the *6* generic calls is calling the wrong thing because a handler was set up incorrectly or the wrong object provided? The code is so 'meta' that it becomes effectively undebuggable.
On 2/15/07, Jean-Paul Calderone <exarkun@divmod.com> wrote,
I've debugged plenty of Twisted applications. So it's not undebuggable. :)
Hence the word "effectively". Or are you offering to be on-call within 5 minutes for anyone wanting to debug code? Not very scalable, that. The code I was talking about took me an hour to track down and I could only find the location by inserting a "print traceback" call to figure out where I was.
Application code tends to reside at the bottom of the call stack, so Python's traceback order puts it right where you're looking, which makes it easy to find.
As I also documented, Twisted tosses a lot of the call stack. Here is the complete and full error message I got:
    Error: [Failure instance: Traceback (failure with no frames): twisted.internet.error.ConnectionRefusedError: Connection was refused by other side: 22: Invalid argument. ]
I wrote the essay at http://www.dalkescientific.com/writings/diary/archive/2006/08/28/levels_of_a... to show, among other things, just how hard it is to figure things out in Twisted.
For any bug which causes something to be set up incorrectly and only later manifests as a traceback, I would posit that whether there is 1 frame or 12, you aren't going to get anything useful out of the traceback.
I posit that tracebacks are useful. Consider:
    def blah():
        make_network_request("A")
        make_network_request("B")
where "A" and "B" are encoded as part of an HTTP POST payload to the same URI. If there's an error in the network connection - e.g., the implementation for "B" on the server dies so the connection closes w/o a response - then knowing that the call for "B" failed and not "A" is helpful during debugging. The low level error message cannot report that. Yes, I could put my own try blocks around everything and contextualize all of the error messages so they are semantically correct for the given level of code. But that would be a lot of code, hard to test, and not cost effective.
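The point about `blah()` can be illustrated with a small, blocking sketch. Here `make_network_request` is a hypothetical stand-in that simply fails for payload "B", and `failing_line_offset` is a helper invented for this example; neither comes from the thread. The traceback alone is enough to tell which of the two calls was running.

```python
import traceback


def make_network_request(name):
    """Hypothetical stand-in for a real request; fails for payload "B"."""
    if name == "B":
        raise ConnectionError("connection closed without a response")


def blah():
    make_network_request("A")
    make_network_request("B")


def failing_line_offset(func, exc):
    """Return how many lines below func's 'def' line the failure happened,
    using only the frames recorded in exc's traceback."""
    for frame in traceback.extract_tb(exc.__traceback__):
        if frame.name == func.__name__:
            return frame.lineno - func.__code__.co_firstlineno
    return None
```

For the failure above the offset points at the second call inside `blah`, the one for "B", even though the error message text never mentions either payload.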
Standard practice here is just to make exception text informative, I think,
If you want to think of it as "exception text" then consider that the stack trace is "just" more text for the message.
but this is another general problem with Python programs and event loops, not one specific to either Twisted itself or the particular APIs Twisted exposes.
The thread is "Twisted Isn't Specific", as a branch of a discussion on microthreads in the standard library. As someone experimenting with Stackless and how it can be used on top of an async library I feel competent enough to comment on the latter topic. As someone who has seen the reverse Bozo bit set by Twisted people on everyone who makes the slightest comment towards using any other async library, and has documented evidence as to just why one might do so, I also feel competent enough to comment on the branch topic. My belief is that there are too many levels of genericity in Twisted. This makes it hard for an outsider to come in and use the system. By "use" I include 1) understanding how the parts go together, 2) diagnosing problems and 3) adding new features that Twisted doesn't support. Life is trade-offs. A Twisted trade-off is genericity at the cost of understandability. Remember, this is all my belief, backed by examples where I did try to understand. My experience with other networking packages has been much easier, including with asyncore and Allegra. They are not as general purpose, but it's hard for me to believe the extra layers in Twisted are needed to get that extra whatever functionality. My other belief is that async programming is hard for most people, who would rather do "normal" programming instead of "inside-out" programming. Because of this 2nd belief I am interested in something like Stackless on top of an async library.
As a personal anecdote, I've never once had to chase a bug through any of the 6 "generic calls" singled out. I can't think of a case where I've helped anyone else who had to do this, either. That part of Twisted is very old, it is _very_ close to bug-free, and application code doesn't have very much control over it at all. Perhaps in order to avoid scaring people, there should be a way to elide frames from a traceback (I don't much like this myself, I worry about it going wrong and chopping out too much information, but I have heard other people ask for it)?
Even though I said some of this earlier I'll elaborate for clarification. The specific bug I was tracking down had *no* traceback. There was nothing to elide. Because there was no traceback I couldn't figure out where the error came from. I had to use the error message text to find the error class, from there modify the source code to generate a traceback, then work up the stack to find the code which had the actual error. Here is the tail end of the traceback.
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/tcp.py", line 535, in doConnect
    self.failIfNotConnected(error.getConnectError((connectResult, os.strerror(connectResult))))
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/error.py", line 160, in getConnectError
    return klass(number, string)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/error.py", line 105, in __init__
    traceback.print_stack()
You can see the actual error occurred at [-3] in the stack where the os.strerror() was. One layer of genericity is mapping OS-level error codes as integers into error classes, one class per integer. You previously said that my problem was resolved thusly:
    Note that we ended up /not/ changing the error number in the case you encountered. We changed the connection setup code to handle the unexpected behavior on OS X. :)
This means that at least someone did help someone track a bug which was affected by those levels of abstraction. BTW, the ticket is at http://twistedmatrix.com/trac/ticket/2022 and the fix was r18064. The final solution was to
    # doConnect gets called twice.  The first time we actually need to
    # start the connection attempt.  The second time we don't really
    # want to (SO_ERROR above will have taken care of any errors, and if
    # it reported none, the mere fact that doConnect was called again is
    # sufficient to indicate that the connection has succeeded), but it
    # is not /particularly/ detrimental to do so.  This should get
    # cleaned up some day, though.
and has nothing to do with changing the error number. Twisted was using the 2nd error code when it should have used the 1st. That was the reason for my getting the "wrong" number. It was the right number for a different check for an error. Note that last comment -- the double call to doConnect was the problem, and a source of my confusion. It remains, just neutered. Also note that that patch included removing code from error.py
     errno.ENETUNREACH: NoRouteError,
     errno.ECONNREFUSED: ConnectionRefusedError,
     errno.ETIMEDOUT: TCPTimedOutError,
    -    # for FreeBSD - might make other unices in certain cases
    -    # return wrong exception, alas
    -    errno.EINVAL: ConnectionRefusedError,
which was part of the mixup that gave me problems. This definitely was an error in one of those levels of abstraction. An earlier problem had been badly "fixed" by incorrectly mapping an error code, probably on the justification of there being an OS error rather than a Twisted implementation problem. But that's just a wild guess based solely on seeing other fixes of that type.
To bring this back into python-dev, .... none of this is a topic for python-dev. I'm reacting to what I perceive as an overly territorial response that occurs nearly every time the words "Twisted", "asynchronous I/O", "reactor" or "main event loop" are uttered. I think using microthreads/stackless/... makes an interesting and useful alternative to the Twisted approach, including different ways to structure the main event loop. I think anyone who's been involved with Python and on this list knows the work Twisted has done to understand platform problems, and needs at most a hint to look at Twisted for insight. Though I feel that such insight is obscured. That said, I resign from this thread and I'll do additional responses in private mail.
Andrew
dalke@dalkescientific.com
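The "one class per integer" layer discussed in this message can be sketched as a plain dictionary lookup. The exception classes and the `get_connect_error` function below are hypothetical stand-ins written for this illustration; they are not Twisted's actual `twisted.internet.error` code. The sketch deliberately leaves `EINVAL` out of the table, as the quoted patch did, so an unexpected code falls back to the generic class while still carrying the original number and OS message.

```python
import errno
import os


class ConnectFailed(Exception):
    """Hypothetical generic connection failure."""

    def __init__(self, number, message):
        super().__init__(number, message)
        self.number = number      # errno exactly as reported by the OS
        self.message = message    # os.strerror() text for that errno


class NoRoute(ConnectFailed):
    """Hypothetical category: network unreachable."""


class Refused(ConnectFailed):
    """Hypothetical category: connection refused."""


class TimedOut(ConnectFailed):
    """Hypothetical category: connection attempt timed out."""


# One class per errno; anything unlisted falls back to ConnectFailed.
ERRNO_TO_CLASS = {
    errno.ENETUNREACH: NoRoute,
    errno.ECONNREFUSED: Refused,
    errno.ETIMEDOUT: TimedOut,
}


def get_connect_error(number):
    """Wrap an OS error number in the matching exception class."""
    klass = ERRNO_TO_CLASS.get(number, ConnectFailed)
    return klass(number, os.strerror(number))
```

Because the instance keeps `number` unchanged, a caller that cares about the platform errno can still read it, while one that only cares about the category can catch the subclass.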

Jean-Paul Calderone <exarkun@divmod.com> wrote:
On Thu, 15 Feb 2007 02:36:22 -0700, Andrew Dalke <dalke@dalkescientific.com> wrote: [snip]
2) asyncore is smaller and easier to understand than Twisted,
While I hear this a lot, applications written with Twisted _are_ shorter and contain less irrelevant noise in the form of boilerplate than the equivalent asyncore programs. This may not mean that Twisted programs are easier to understand, but it is at least an objectively measurable metric.
In my experience, the boilerplate is generally incoming and outgoing buffers. If both had better (optional default) implementations, and perhaps a way of saying "use the default implementations of handle_close, etc.", then much of the boilerplate would vanish. People would likely implement a found_terminator method and be happy.
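As a rough sketch of the default buffering described here, the class below keeps an incoming and an outgoing buffer and calls `found_terminator` once per complete line. The class name `LineBuffer` and its `feed`, `push` and `take_outgoing` methods are assumptions for this example, and unlike asynchat's `found_terminator`, this one receives the completed line as an argument.

```python
class LineBuffer:
    """Hypothetical helper providing default incoming/outgoing buffers."""

    terminator = b"\r\n"

    def __init__(self):
        self._incoming = bytearray()
        self._outgoing = bytearray()

    def feed(self, data):
        """Add received bytes and dispatch every complete line."""
        self._incoming += data
        while True:
            head, sep, tail = self._incoming.partition(self.terminator)
            if not sep:
                break  # no complete line yet; wait for more data
            self._incoming = tail
            self.found_terminator(bytes(head))

    def push(self, data):
        """Queue bytes to be written when the socket is writable."""
        self._outgoing += data

    def take_outgoing(self, limit=4096):
        """Remove and return up to 'limit' queued bytes."""
        chunk = bytes(self._outgoing[:limit])
        del self._outgoing[:limit]
        return chunk

    def found_terminator(self, line):
        """Subclasses override this; it is the only required method."""
        raise NotImplementedError
```

A subclass then only implements `found_terminator`, which is the outcome the paragraph above hopes for.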
and 3) it was for demo/proof of concept purposes. While tempting to improve that module I know that Twisted has already gone through all the platform-specific crap and I don't want to go through it again myself. I don't want to write a reactor to deal with GTK, and one for OS X, and one for ...
Now if we can only figure out a way for everyone to benefit from this without tying too many brains up in knots. :)
Whenever I need to deal with these kinds of things (in wxPython specifically), I usually set up a wxTimer to signal asyncore.poll(timeout=0), but I'm lazy, and rarely need significant throughput in my GUI applications. [snip]
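The timer-driven approach can be sketched without wxPython or asyncore by using the standard `selectors` module as a stand-in: a GUI timer callback would call a non-blocking poll on every tick. The function name `poll_once` and the convention of storing a callback as the registration data are assumptions for this illustration.

```python
import selectors
import socket


def poll_once(selector):
    """Dispatch whatever is ready right now and return immediately.

    With timeout=0 the select call does not block, so this is safe to
    call from a GUI timer callback.
    """
    handled = 0
    for key, _events in selector.select(timeout=0):
        callback = key.data  # the callable given at registration time
        callback(key.fileobj)
        handled += 1
    return handled


def demo():
    """Register one end of a socket pair and poll it by hand."""
    selector = selectors.DefaultSelector()
    left, right = socket.socketpair()
    received = []
    selector.register(
        right, selectors.EVENT_READ, lambda sock: received.append(sock.recv(100))
    )
    left.sendall(b"ping")
    return selector, left, right, received
```

The trade-off is the one already noted: throughput is bounded by how often the timer fires, which is acceptable only when the GUI application moves little data.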
Yet by using the Stackless socket monkeypatch, this same code works in an async framework. And the underlying libraries have a much larger developer base than Twisted. Want NNTP? "import nntplib" Want POP3? "import poplib" Plenty of documentation about them too.
This is going to come out pretty harshly, for which I can only apologize in advance, but it bears mention. The quality of protocol implementations in the standard library is bad. As in "not good". Twisted's NNTP support is better (even if I do say so myself - despite having been worked on only by me, when I knew almost nothing about Twisted, and having essentially never been touched since). Twisted's POP3 support is fantastically awesome. Next to imaplib, twisted.mail.imap4 is a sparkling diamond. And each of these implements the server end of the protocol as well: you won't find that in the standard library for almost any protocol.
Protocol support is hit and miss. NNTP in Python could be better, but that's not an asyncore issue (being that nntplib isn't implemented using asyncore), that's an "NNTP in Python could be done better" issue. Is it worth someone's time to patch it, or should they just use Twisted? Well, if we start abandoning stdlib modules, "because they can always use Twisted", then we may as well just ship Twisted with Python. - Josiah

On Thu, 15 Feb 2007 13:55:31 -0800, Josiah Carlson <jcarlson@uci.edu> wrote:
Jean-Paul Calderone <exarkun@divmod.com> wrote: [snip]
Now if we can only figure out a way for everyone to benefit from this without tying too many brains up in knots. :)
Whenever I need to deal with these kinds of things (in wxPython specifically), I usually set up a wxTimer to signal asyncore.poll(timeout=0), but I'm lazy, and rarely need significant throughput in my GUI applications.
And I guess you also don't mind that on OS X this is often noticeably broken? :)
[snip]
Protocol support is hit and miss. NNTP in Python could be better, but that's not an asyncore issue (being that nntplib isn't implemented using asyncore), that's an "NNTP in Python could be done better" issue. Is it worth someone's time to patch it, or should they just use Twisted? Well, if we start abandoning stdlib modules, "because they can always use Twisted", then we may as well just ship Twisted with Python.
We could always replace the stdlib modules with thin compatibility layers based on the Twisted protocol implementations. It's trivial to turn an asynchronous API into a synchronous one. I think you are correct in marking this an unrelated issue, though. Jean-Paul
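The claim that an asynchronous API can be turned into a synchronous one can be sketched with the standard library's asyncio module, used here as a stand-in rather than Twisted. `fetch_greeting` is a hypothetical asynchronous call whose `sleep` stands in for network I/O; the wrapper simply runs an event loop until that one call completes.

```python
import asyncio


async def fetch_greeting(name):
    """Hypothetical asynchronous API; the sleep stands in for network I/O."""
    await asyncio.sleep(0)
    return "hello, " + name


def fetch_greeting_sync(name):
    """Blocking wrapper: run an event loop until the coroutine finishes."""
    return asyncio.run(fetch_greeting(name))
```

A compatibility layer of the kind suggested would expose functions shaped like `fetch_greeting_sync`. One limitation of this particular sketch is that `asyncio.run` raises `RuntimeError` when called from code already running inside an event loop, so the wrapper is only usable from ordinary blocking code.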

Jean-Paul Calderone <exarkun@divmod.com> wrote:
On Thu, 15 Feb 2007 13:55:31 -0800, Josiah Carlson <jcarlson@uci.edu> wrote:
Jean-Paul Calderone <exarkun@divmod.com> wrote: [snip]
Now if we can only figure out a way for everyone to benefit from this without tying too many brains up in knots. :)
Whenever I need to deal with these kinds of things (in wxPython specifically), I usually set up a wxTimer to signal asyncore.poll(timeout=0), but I'm lazy, and rarely need significant throughput in my GUI applications.
And I guess you also don't mind that on OS X this is often noticeably broken? :)
I don't own a Mac, and so far, of the perhaps dozen or so Mac users of the software that does this, I've heard no reports of it being broken. From what I understand, wxTimers work on all supported platforms (which includes OSX), and if asyncore.poll() is broken on Macs, then someone should file a bug report. If it's asyncore's fault, assign it to me, otherwise someone with Mac experience needs to dig into it.
[snip] Protocol support is hit and miss. NNTP in Python could be better, but that's not an asyncore issue (being that nntplib isn't implemented using asyncore), that's an "NNTP in Python could be done better" issue. Is it worth someone's time to patch it, or should they just use Twisted? Well, if we start abandoning stdlib modules, "because they can always use Twisted", then we may as well just ship Twisted with Python.
We could always replace the stdlib modules with thin compatibility layers based on the Twisted protocol implementations. It's trivial to turn an asynchronous API into a synchronous one. I think you are correct in marking this an unrelated issue, though.
If the twisted folks (or anyone else) want to implement a "shim" that pretends to be nntplib, it's their business whether calling twisted.internet.monkeypatch.nntplib() does what the name suggests. :) That is to say, I don't believe anyone would be terribly distraught if there was an easy way to use Twisted without drinking the kool-aid. Then again, I do believe that it makes sense to patch the standard library whenever possible - if Twisted has better parsing of nntp, smtp, pop3, imap4, etc. responses, then perhaps we should get the original authors to sign a PSF contributor agreement, and we could translate whatever is better. - Josiah
participants (8): A.M. Kuchling, Aahz, Andrew Dalke, glyph@divmod.com, Jean-Paul Calderone, Joachim König-Baltes, Josiah Carlson, Thomas Heller