On Sat, 17 Dec 2005 23:14:10 +0100, Jesus Cea <jcea@argo.es> wrote:
Twisted 2.1, twisted.named 0.2, here.
I'm taking my first steps with Twisted (documentation -inexistence- nightmare :-), and my first project will be a bulk mailer as the backend of my mailing list system.
The application would take the message and the subscriber list and a) resolve the MX for the domains and b) connect to the MXs and send the message, trying to minimize traffic by sending a single envelope for several recipients sharing the domain or the MXs.
I'm currently doing the DNS stuff. The results are promising, resolving about 200 domains per second on a 1.4GHz P4, so my biggest mailing list (about 31500 unique domains, multiple subscribers per domain) is "resolved" in less than three minutes.
Nice so far. The demo code (2Kbytes) is the following (if I'm violating the rules posting this code, please tell me):
=====
# File "dns.tac"
from twisted.application import service
application = service.Application("DNS test")
You probably want to move most of your program out of "dns.tac" and into an importable Python module. Code defined inside .tac files lives in a weird world where some surprising rules apply. It's best to keep the .tac file as short as possible. Generally, you just want to create an Application and give it some children, importing from modules the definitions of all classes and functions needed to set this up.
import time
t=time.time()
class resolucion(object) :
  def __init__(self,dominio) :
    from twisted.names import client
    d = client.lookupMailExchange(dominio,timeout=(60,))
Passing (60,) as the timeout might not be the best idea. This will cause the DNS client to send one request and then wait 60 seconds for a response. If either the request or the response is dropped (as often happens with UDP traffic), you will never get a result, and you will have to wait 60 seconds to discover this fact. If you don't want retransmission, a value of (15,) or so is probably better. However, I suspect you really do want retransmissions. The default timeout is also 60 seconds total, but performs several retransmissions during the interim.
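To make the difference concrete, here is a rough sketch in plain Python (not Twisted code) of how a timeout tuple maps to retransmissions, assuming each element is the number of seconds to wait after one attempt before sending the next. The helper name `retransmission_schedule` is made up for illustration, and (1, 3, 11, 45) is what I believe the default to be; check your installed version before relying on it.

```python
def retransmission_schedule(timeouts):
    """Return (send_times, give_up_time) for a tuple of per-attempt waits.

    Assumption: one request is sent per tuple element, and each element
    is how long to wait for a reply to that request before moving on.
    """
    send_times = []
    elapsed = 0
    for wait in timeouts:
        send_times.append(elapsed)  # a request goes out at this offset
        elapsed += wait             # then wait this long for a reply
    return send_times, elapsed      # elapsed is when the lookup fails


if __name__ == "__main__":
    # (1, 3, 11, 45) is assumed to be the library default, not confirmed here.
    for timeouts in [(60,), (15,), (1, 3, 11, 45)]:
        sends, give_up = retransmission_schedule(timeouts)
        print("%r: requests sent at %r, gives up after %d seconds"
              % (timeouts, sends, give_up))
```

Under that assumption, (60,) sends a single request and only reports failure a minute later, while a multi-element tuple retries several times within the same total time.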
    d.addCallbacks(self._cbMailExchange, self._ebMailExchange)
    self.dominio=dominio
  def _cbMailExchange(self,results):
    # Callback for MX query
    global aun_pendientes
    aun_pendientes-=1
    if not aun_pendientes :
      print "OK",time.time()-t
      return
      from twisted.internet import reactor
      reactor.stop()
      return
    if not len(pendientes) : return

    resolucion(pendientes.pop())
    from twisted.names.dns import QUERY_TYPES
    for i in results[0] :
      n=i.payload.name
      tipo=QUERY_TYPES[i.payload.TYPE]
      if tipo=="MX" :
You can just use dns.MX here, instead of looking up "MX" in QUERY_TYPES.
        return
        p=i.payload.preference
        print n,p,
        for j in results[2] :
          if n==j.name :
            print j.payload.dottedQuad(),"(%d)" %j.ttl
            break
        else :
          print "???"
      elif tipo=="CNAME" :
        redirigidos.append((self.dominio,i.payload.name))
  def _ebMailExchange(self,failure):
    # Error callback for MX query
    global aun_pendientes
    aun_pendientes-=1
    if not aun_pendientes :
      print "ERROR",time.time()-t
      return
      from twisted.internet import reactor
      reactor.stop()
      return
    if not len(pendientes) : return

    resolucion(pendientes.pop())
    print "XXX",self.dominio
    print 'Lookup failed:'
    failure.printTraceback()
pendientes=[]
redirigidos=[]

f=open("domain_list")
for i in f :
  pendientes.append(i)
aun_pendientes=len(pendientes)
concurrencia=1000
for i in pendientes[:concurrencia] : resolucion(i)
from twisted.names import client
client.theResolver.resolvers[-1].dynServers=[('127.0.0.1', 53)]
# client.theResolver.resolvers=[client.theResolver.resolvers[-1]]
To customize the server used by the resolver, you may want to create your own resolver instance, rather than relying on the defaults guessed by the resolver automatically created in the client module.
pendientes=pendientes[concurrencia:]
=====
I launch the code as "twistd -ny dns.tac".
The demo does 1000 resolutions in parallel. If you experiment with the code, reduce the value.
Questions:
1. I get a warning: "[Uninitialized] /usr/local/lib/python2.4/site-packages/twisted/names/dns.py:1227: exceptions.DeprecationWarning: Deferred.setTimeout is deprecated. Look for timeout support specific to the API you are using instead."
I'm using the native "twisted.names" timeout API, as far as I know...
This is a problem internal to twisted.names. Your code isn't doing anything wrong to cause it. Hopefully this will be fixed by the next release.
2. By default "twisted.names.client" uses the "/etc/resolv.conf" file to know which nameservers to use. I, nevertheless, want to use a particular nameserver, so:
2.1. I couldn't find an appropriate API. I had to do a "hack", reading the "twisted.names" core to know implementation details: "client.theResolver.resolvers[-1].dynServers=[('127.0.0.1', 53)]"
2.2. The previous "hack" is only effective for future "twisted.names.client" instances. The previous ones use the "/etc/resolv.conf" entries. Putting the "hack" code before any instance creation doesn't work.
2.3. While reading the framework code, I saw that "client" uses a resolver chain: host, cache, network. But the cache is initially clear (of course) and NEVER ever gets populated, so we are not using it but checking missing entries eats CPU: 155 seconds for the unchanged code, 125 seconds if I drop the host and cache resolvers.
A caching client would be very nice, if the client is long running (my original idea).
All three of these can be addressed by constructing your own resolver:

    from twisted.names import client
    myResolver = client.Resolver(servers=[('127.0.0.1', 53)])

This gives you a resolver which uses only localhost, doesn't involve any nasty hacks, and doesn't have an /etc/hosts resolver or a caching resolver to slow things down.
2.4. The resolution failure code is only called if the resolution times out. But if the domain doesn't exist, the code called is the "success" one, with a "nil" answer. So we can't differentiate between nonexistent domains and nonexistent RRs.
Hmm. The non-existence of the domain is hidden by the very last step in performing the lookup. The Resolver class has a method, filterAnswers, which is used to turn a DNS response into the three-tuple of lists which all the lookup* methods return. You may want to subclass Resolver and override filterAnswers to behave differently when the `message' argument it is given has an `rCode' attribute equal to twisted.names.dns.ENAME, which indicates the name requested does not exist.
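The decision such an override would have to make can be sketched in plain Python, without Twisted. This is only an illustration of the check, not the library's API: `classify_mx_response` and its return labels are made up, and the constants are the standard DNS response codes from RFC 1035 (0 for no error, 3 for name error), which I believe match twisted.names.dns.OK and twisted.names.dns.ENAME.

```python
# Standard DNS response codes (RFC 1035). Assumed to correspond to
# twisted.names.dns.OK and twisted.names.dns.ENAME respectively.
RCODE_OK = 0
RCODE_NAME_ERROR = 3


def classify_mx_response(rcode, answers):
    """Classify a DNS reply to an MX query (hypothetical helper).

    rcode   -- the response code carried by the DNS message
    answers -- the list of records from the answer section
    """
    if rcode == RCODE_NAME_ERROR:
        # The name itself does not exist.
        return "no-such-domain"
    if rcode != RCODE_OK:
        # Server failure, refused, and so on.
        return "server-error"
    if not answers:
        # The name exists but has no records of the requested type.
        return "no-records"
    return "has-records"


if __name__ == "__main__":
    print(classify_mx_response(RCODE_NAME_ERROR, []))
    print(classify_mx_response(RCODE_OK, []))
```

In an overridden filterAnswers, the first branch is where you would fail the lookup (so your errback runs) instead of returning three empty lists.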
3. How can I stop this ".tac"? If I do "reactor.stop()", I get an infinite error, repeated forever:
reactor.stop() is the correct way to end the program. If you still have this problem after you have split the program into multiple files, please post again.

Jean-Paul