[Python-Dev] Twisted Isn't Specific (was Re: Trial balloon: microthreads library in stdlib)
Andrew Dalke
dalke at dalkescientific.com
Thu Feb 15 10:36:22 CET 2007
I was the one on the Stackless list who last September or so
proposed the idea of monkeypatching and I'm including that
idea in my presentation for PyCon. See my early rough draft
at http://www.stackless.com/pipermail/stackless/2007-February/002212.html
which contains many details about using Stackless, though
none on the Stackless implementation. (A lot on how to tie things together.)
So people know, I am an applications programmer and not a
systems programmer. Things like OS-specific event mechanisms
annoy and frustrate me. If I could do away with hardware and
still write useful programs I would.
I have tried 3 times to learn Twisted. The first time I found
and reported various problems and successes. See emails at
http://www.twistedmatrix.com/pipermail/twisted-python/2003-June/thread.html
The second time was to investigate a way to report upload
progress: http://twistedmatrix.com/trac/ticket/288
and the third was to compare Allegra and Twisted
http://www.dalkescientific.com/writings/diary/archive/2006/08/28/levels_of_abstraction.html
In all three cases I've found it hard to use Twisted because
the code didn't do as I expected it to do and when something
went wrong I got results which were hard to interpret. I
believe others have similar problems, and that this is one reason Twisted
is considered to be "a big, complicated, inseparable hairy mess."
I find the Stackless code also hard to understand. Eg,
I don't know where the watchdog code is for the "run()"
command. It uses several layers of macros and I haven't
been able to get it straight in my head. However, so far
I've not run into the kind of strange errors in Stackless
that I have in Twisted.
I find the normal Python code relatively easy to understand.
Stackless only provides threadlets. It does no I/O.
Richard Tew developed a "stacklesssocket" module which emulates
the API for the stdlib "socket" module. I tweaked it a
bit and showed that by doing the monkeypatch
import stacklesssocket
import sys
sys.modules["socket"] = stacklesssocket
then code like "urllib.urlopen" became Stackless compatible.
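The reason this works is that "import" consults sys.modules before
it searches the path. Here is a minimal sketch of that mechanism. It
uses a made-up module name ("demo_socket") and a stand-in function
rather than the real stacklesssocket, so that it is safe to run
anywhere:

```python
import sys
import types

# Build a stand-in module.  "demo_socket" and "create_connection" are
# made up for illustration; this is not the stacklesssocket API.
replacement = types.ModuleType("demo_socket")

def create_connection(address):
    return "patched connection to %s:%d" % address

replacement.create_connection = create_connection

# Install it.  Any later "import demo_socket" gets this object because
# the import machinery checks sys.modules before searching the path.
sys.modules["demo_socket"] = replacement

import demo_socket
result = demo_socket.create_connection(("example.com", 80))
```

The catch is ordering: a module which already did "from socket import
socket" before the patch keeps its reference to the original object.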
Eg, in my PyCon talk draft I show something like
import slib
# must monkeypatch before any other module imports "socket"
slib.use_monkeypatch()
import urllib2
import time
import hashlib
def fetch_and_reverse(host):
    t1 = time.time()
    s = urllib2.urlopen("http://"+host+"/").read()[::-1]
    dt = time.time() - t1
    digest = hashlib.md5(s).hexdigest()
    print "hash of %r/ = %s in %.2f s" % (host, digest, dt)
slib.main_tasklet(fetch_and_reverse)("www.python.org")
slib.main_tasklet(fetch_and_reverse)("docs.python.org")
slib.main_tasklet(fetch_and_reverse)("planet.python.org")
slib.run_all()
where the three fetches occur in parallel.
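To give a feel for what "in parallel" means here without needing
Stackless installed, here is a rough sketch of cooperative scheduling
which uses plain generators in place of tasklets. The names (worker,
run_all) are invented for illustration and are not the slib or
Stackless API. Each "yield" stands in for the point where a real
tasklet would block on a socket and let another one run.

```python
from collections import deque

def worker(name, steps, log):
    # Each yield is a point where a real tasklet would wait on I/O.
    for i in range(steps):
        log.append("%s step %d" % (name, i))
        yield

def run_all(tasks):
    # Round-robin: run each task up to its next yield, then move on.
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # finished, so drop it from the queue
        queue.append(task)

log = []
run_all([worker("a", 2, log), worker("b", 2, log)])
```

The log ends up interleaved (a, b, a, b) rather than a running to
completion before b starts. With generators the yield has to be
written out explicitly, which is what real tasklets avoid.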
The choice of asyncore is, I think, done because 1) it
prevents needing an external dependency, 2) asyncore is
smaller and easier to understand than Twisted, and
3) it was for demo/proof of concept purposes. While it is
tempting to improve that module, I know that Twisted
has already gone through all the platform-specific crap
and I don't want to go through it again myself. I don't
want to write a reactor to deal with GTK, and one for
OS X, and one for ...
Another reason I think Twisted is considered "tangled-up
Deep Magic, only for Wizards Of The Highest Order" is because
it's infused with event-based processing. I've done a lot
of SAX processing and I can say that few people think that
way or want to go through the process of learning how.
Compare, for example, the following
f = urllib2.urlopen("http://example.com/")
for i, line in enumerate(f):
    print ("%06d" % i), repr(line)
with the normal equivalent in Twisted or another
async-based system.
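As a sketch of the general shape of that event-driven equivalent,
without using any real Twisted or asyncore API: the class and method
names below (LineNumberer, data_received, connection_lost) are
hypothetical. The point is that the loop turns inside out, and the
line counter and any partial line have to be carried between
callbacks by hand.

```python
class LineNumberer(object):
    def __init__(self):
        self.count = 0       # state the for-loop kept for free
        self.buffer = b""    # partial line left over from the last chunk
        self.output = []

    def _emit(self, line):
        self.output.append("%06d %r" % (self.count, line))
        self.count += 1

    def data_received(self, chunk):
        # Chunks arrive at arbitrary boundaries, so split lines by hand.
        self.buffer += chunk
        while b"\n" in self.buffer:
            line, self.buffer = self.buffer.split(b"\n", 1)
            self._emit(line + b"\n")

    def connection_lost(self):
        # Flush a final line that had no trailing newline.
        if self.buffer:
            self._emit(self.buffer)
            self.buffer = b""

# Simulate an event loop delivering data in awkward pieces.
p = LineNumberer()
for chunk in (b"first li", b"ne\nsecond line\nla", b"st"):
    p.data_received(chunk)
p.connection_lost()
```

Three lines of straight-line code have become a class with two
pieces of bookkeeping state, and that is before any error handling.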
Yet by using the Stackless socket monkeypatch, this
same code works in an async framework. And the underlying
libraries have a much larger developer base than Twisted.
Want NNTP? "import nntplib" Want POP3? "import poplib"
Plenty of documentation about them too.
On the Stackless mailing list I have proposed someone work
on a talk for EuroPython titled "Stackless and Twisted".
Andrew Francis has been looking into how to do that.
All the earlier quotes were lifted from glyph. Here's another:
> When you boil it down, Twisted's event loop is just a
> notification for "a connection was made", "some data was
> received on a connection", "a connection was closed", and
> a few APIs to listen or initiate different kinds of
> connections, start timed calls, and communicate with threads.
> All of the platform details of how data is delivered to the
> connections are abstracted away.. How do you propose we
> would make a less "specific" event mechanism?
What would I need to do to extract this Twisted core so
I could replace asyncore? I know at minimum I need
"twisted.internet" and "twisted.python" (the latter for
logging) and "twisted.persisted" for "styles.Ephemeral".
But I say this hesitantly recalling the frustrations
I had in dealing with a connection error in Twisted,
described in the aforementioned link
http://www.dalkescientific.com/writings/diary/archive/2006/08/28/levels_of_abstraction.html
I feel that using the phrase "just a" in the previously quoted
text is an understatement. While the mechanics might be
simple, there are many, many layers, as you can see in this
stack trace.
  File "async_blast.py", line 55, in ?
    reactor.run()
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/posixbase.py", line 218, in run
    self.mainLoop()
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/posixbase.py", line 229, in mainLoop
    self.doIteration(t)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/selectreactor.py", line 133, in doSelect
    _logrun(selectable, _drdw, selectable, method, dict)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/python/log.py", line 53, in callWithLogger
    return callWithContext({"system": lp}, func, *args, **kw)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/python/log.py", line 38, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/python/context.py", line 59, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/python/context.py", line 37, in callWithContext
    return func(*args,**kw)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/selectreactor.py", line 139, in _doReadOrWrite
    why = getattr(selectable, method)()
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/tcp.py", line 535, in doConnect
    self.failIfNotConnected(error.getConnectError((connectResult,
        os.strerror(connectResult))))
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/error.py", line 160, in getConnectError
    return klass(number, string)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/twisted/internet/error.py", line 105, in __init__
    traceback.print_stack()
That feels like 6 layers too many, given that
    _logrun(selectable, _drdw, selectable, method, dict)
    return context.call({ILogContext: newCtx}, func, *args, **kw)
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
    return func(*args, **kw)
    getattr(selectable, method)()
    klass(number, string)
are all generic calls. (Note that I argued against the
twisted.internet.error way of doing things as it changed my
error number on me and gave me a non-system-standard, non-i18n
error message.)
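For reference, the system-standard number and message are the ones
the platform itself reports, which the stdlib exposes directly. A
small sketch using only the stdlib (describe_errno is a name made up
here):

```python
import errno
import os

def describe_errno(number):
    # Symbolic name (e.g. "ECONNREFUSED") and the platform's own message.
    # os.strerror() text varies by platform, so it isn't hard-coded here.
    return errno.errorcode.get(number, "UNKNOWN"), os.strerror(number)

name, message = describe_errno(errno.ECONNREFUSED)
```

An exception that carries that number and that message unchanged is
one that can be matched against errno constants and looked up in the
system documentation.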
I do not think Twisted can be changed to be an async
kernel of the sort I would like without making enough
changes as to be incompatible with the existing code.
Also, and I say this to stress the difficulties of an outsider
in using Twisted, I don't understand what's meant by "IProtocol" in
> At the very least, standardizing on something very much like
> IProtocol would go a long way towards making it possible to
> write async clients and servers
There are 37 pages (according to Google) in the twistedmatrix domain
which talk about IProtocol and are not "API docs" or part of a ticket.
IProtocol site:twistedmatrix.com -"API docs" -"twisted-commits"
None provided insight. The API doc is at
http://twistedmatrix.com/documents/current/api/twisted.internet.interfaces.IProtocol.html
but I don't know how to use it or even why it would work. How would
I add that to an asyncore-based library? What would I need to
support the adaption? There's a very high barrier to entry and while
I know there are end rewards like support across many platforms
I also know that I only really need to support server-side Mac
and Linux boxes, and no GUIs, so asyncore may be good enough
for my own work.
Andrew
dalke at dalkescientific.com
> At the very least, standardizing on something very much like IProtocol would
> go a long way towards making it possible to write async clients and servers
> that could run out of the box in the stdlib as well as with Twisted, even if
> the specific hookup mechanism (listenTCP, listenSSL, et. al.) were
> incompatible - although a signature compatible callLater would probably be a
> must.
>
> As I said, I don't have time to write the PEPs myself, but I might fix some
> specific bugs if there were a clear set of issues preventing this from
> moving forward. Better integration with the standard library would
> definitely be a big win for both Twisted and Python.