
Andrew Bennetts wrote:
Hmm, it's unlikely to be DNS lookups causing it, then.
We need some way to narrow down where it's happening. There are a few options I can think of, but they're all a bit heavyweight...
- Use strace to get some idea what it's doing
- Use the --spew option of twistd (or manually install the spewer with "from twisted.python.util import spewer; sys.settrace(spewer)")
- Use gdb to attach to the process, then look at the backtrace there.
(You can apparently get the python backtrace in gdb by putting this macro in your .gdbinit:
define ppystack
    while $pc < Py_Main || $pc > Py_GetArgcArgv
        if $pc > eval_frame && $pc < PyEval_EvalCodeEx
            set $__fn = PyString_AsString(co->co_filename)
            set $__n = PyString_AsString(co->co_name)
            printf "%s (%d): %s\n", $__fn, f->f_lineno, $__n
        end
        up-silently 1
    end
    select-frame 0
end
But I've never tried this... )
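The tracing option above can be tried on a small scale with nothing but the standard library. This is only a rough sketch of the idea behind sys.settrace, not Twisted's spewer: make_line_tracer, traced_call and example are hypothetical names made up for illustration, and the code is written for a current Python 3 rather than the 2.3 in this thread.

```python
import sys


def make_line_tracer(log):
    """Return a trace function that records (function name, line number)."""
    def tracer(frame, event, arg):
        if event == "line":
            log.append((frame.f_code.co_name, frame.f_lineno))
        # Returning the tracer keeps line-level tracing active for this frame.
        return tracer
    return tracer


def traced_call(func, *args, **kwargs):
    """Run func under the tracer and return (result, log)."""
    log = []
    previous = sys.gettrace()
    sys.settrace(make_line_tracer(log))
    try:
        result = func(*args, **kwargs)
    finally:
        # Restore whatever trace function was installed before.
        sys.settrace(previous)
    return result, log


def example(n):
    # Hypothetical workload standing in for the code that appears to hang.
    total = 0
    for i in range(n):
        total += i
    return total


result, log = traced_call(example, 3)
for name, lineno in log:
    print("%s: line %d" % (name, lineno))
```

In a hung process the last line printed before the output stops is roughly where execution is stuck, which is the same information --spew gives, only far more verbosely.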
Is it possible that feedparser is hanging while trying to parse that feed? Perhaps try putting print statements before and after the feedparser.parse call.
Maybe the problem is there, but then that wouldn't answer the other question: "Why does it take at most 30 seconds to parse all the remaining 350 feeds?" There is no network activity after the unlocking "Ctrl+C"... Gotta investigate then.
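The print-before-and-after check suggested above could look something like the following. It is a minimal sketch under assumptions: timed_call is a hypothetical helper, and fake_parse is a stand-in so the example runs without feedparser installed; in the real script the wrapped function would be feedparser.parse.

```python
import time


def timed_call(label, func, *args, **kwargs):
    """Print a marker before and after calling func, with the elapsed time."""
    print("start %s" % label, flush=True)
    started = time.monotonic()
    try:
        return func(*args, **kwargs)
    finally:
        # Runs even if func raises, so a failing parse is still reported.
        elapsed = time.monotonic() - started
        print("done %s after %.3fs" % (label, elapsed), flush=True)


def fake_parse(data):
    # Hypothetical stand-in for feedparser.parse, used only for illustration.
    return {"entries": data.split()}


parsed = timed_call("example feed", fake_parse, "a b c")
```

If a "start" line shows up with no matching "done" line, the hang is inside the parse call; if both appear promptly for every feed, the delay is somewhere else.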
You should be able to test this theory by installing Twisted's resolver:
from twisted.names import client
reactor.installResolver(client.createResolver())
client.createResolver makes a reasonable effort to use your system's DNS configuration (by looking at /etc/resolv.conf on posix systems, for example), so it should work without any special arguments.
ok, it changes into a totally non-working script :)
I get a lot of these:
[Failure instance: Traceback: exceptions.TypeError, unsubscriptable object
/usr/lib/python2.3/site-packages/twisted/internet/defer.py:313:_runCallbacks
/usr/lib/python2.3/site-packages/twisted/names/resolve.py:44:__call__
/usr/lib/python2.3/site-packages/twisted/names/common.py:36:query
/usr/lib/python2.3/site-packages/twisted/names/common.py:104:lookupAllRecords
/usr/lib/python2.3/site-packages/twisted/names/client.py:266:_lookup
/usr/lib/python2.3/site-packages/twisted/names/client.py:214:queryUDP
]
Ouch. I wonder how that bug crept in? The twisted.names code is expecting a sequence of timeouts (to re-issue the query with, until failing at last), but twisted.internet is only giving it a single integer. I've filed a bug report for this: http://twistedmatrix.com/bugs/issue570, if you care :)
Sure :), this is the second bug for me, the first one was a documentation bug, the finger tutorial has some errors :).
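For anyone wondering why a sequence-versus-integer mismatch shows up as "unsubscriptable object", here is a small illustration. It is not the twisted.names code: next_timeout and normalize_timeouts are hypothetical names, and the normalizing step is just one possible defensive approach, not necessarily how the bug was actually fixed.

```python
def next_timeout(timeouts):
    """Split a retry schedule into (current timeout, remaining timeouts)."""
    return timeouts[0], timeouts[1:]


def normalize_timeouts(timeout):
    """Accept a single number or a sequence and always return a tuple."""
    if isinstance(timeout, (int, float)):
        return (timeout,)
    return tuple(timeout)


# Passing a bare integer where a sequence is expected fails on indexing.
error = None
try:
    next_timeout(10)
except TypeError as exc:
    error = exc
print("got:", type(error).__name__)

# Normalizing first lets both kinds of caller work.
current, remaining = next_timeout(normalize_timeouts(10))
print(current, remaining)
```

The code receiving the value indexes into it to pick the first timeout, so an integer blows up at exactly that point, which matches the last frame of the traceback being in queryUDP.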
Absolutely. I've heard similar complaints about straw, and I've been hoping some keen person would apply Twisted to fix the problem :)
That was my hope too, but since a friend of mine asked for an rss-aggregator made with twisted... I realized that someone wants me to be that keen person. Oooohhhh Which thing has the fate classified for me? Ooooooohhhhh :P
--
Valentino Volonghi aka Dialtone
Linux User #310274, Gentoo Proud User
X Python Newsreader developer
http://sourceforge.net/projects/xpn/