python proxy checker, change to threaded version

elca highcar at gmail.com
Mon Dec 7 03:16:46 EST 2009




r0g wrote:
> 
> elca wrote:
>> Hello all,
>> 
>> I have a Python proxy checker, and to speed up the checks I decided to
>> convert it to a multithreaded version.
>> 
>> This is my first time using the thread module. I have tried several
>> times to convert the script and looked through a lot of material, but
>> it is not easy for a novice Python programmer.
>> 
>> If anyone can help, I would really appreciate it!
>> 
>> Thanks in advance!
>> 
>> 
>>     import urllib2, socket
>>     
>>     socket.setdefaulttimeout(180)
>>     # read the proxy list; split on whitespace so each entry is one proxy
>>     # (iterating over the raw string would yield single characters)
>>     proxyList = open('listproxy.txt').read().split()
>>     
>>     def is_bad_proxy(pip):
>>         try:
>>             proxy_handler = urllib2.ProxyHandler({'http': pip})
>>             opener = urllib2.build_opener(proxy_handler)
>>             opener.addheaders = [('User-agent', 'Mozilla/5.0')]
>>             # check whether the proxy is alive; use this opener directly
>>             # instead of install_opener(), which changes global state
>>             req = urllib2.Request('http://www.yahoo.com')
>>             sock = opener.open(req)
>>             sock.close()
>>         except urllib2.HTTPError, e:
>>             print 'Error code: ', e.code
>>             return e.code
>>         except Exception, detail:
>>             print "ERROR:", detail
>>             return 1
>>         return 0
>>     
>>     
>>     for item in proxyList:
>>         if is_bad_proxy(item):
>>             print "Bad Proxy", item
>>         else:
>>             print item, "is working"
> 
> 
> 
> The trick to threads is to create a subclass of threading.Thread, define
> the 'run' function and call the 'start()' method. I find threading quite
> generally useful so I created this simple generic function for running
> things in threads...
> 
> 
> def run_in_thread( func, func_args=(), callback=None, callback_args=() ):
>     import threading
>     class MyThread ( threading.Thread ):
>         def run ( self ):
> 
>             # Call function (an empty func_args unpacks to no arguments)
>             result = func(*func_args)
> 
>             # Call callback with the result plus any extra arguments
>             if callback:
>                 callback(result, *callback_args)
> 
>     MyThread().start()
> 
> 
> You need to pass it a test function (+args) and, if you want to get a
> result back from each thread, you also need to provide a callback
> function (+args). The first parameter of the callback function receives
> the result of the test function, so your callback would look something
> like this...
> 
> def cb( result, item ):
>     if result:
>         print "Bad Proxy", item
>     else:
>         print item, "is working"
> 
> 
> And your calling loop would be something like this...
> 
> for item in proxyList:
>     run_in_thread( is_bad_proxy, func_args=[ item ], callback=cb,
>                    callback_args=[ item ] )
> 
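
For reference, a self-contained sketch of the subclass-and-callback pattern described above, written in Python 3 syntax rather than the Python 2 used elsewhere in this thread. The names CheckThread, fake_check, BAD_PROXIES and check_all are hypothetical, and fake_check is only a stand-in that does no network access; a real run would pass is_bad_proxy instead. Joining the threads and guarding the results with a lock are additions here, not something run_in_thread does.

```python
import threading


class CheckThread(threading.Thread):
    """Runs func(*args) in its own thread, then hands the result to callback."""

    def __init__(self, func, args, callback):
        threading.Thread.__init__(self)
        self.func = func
        self.args = args
        self.callback = callback

    def run(self):
        result = self.func(*self.args)
        # The callback receives the result first, then the original arguments.
        self.callback(result, *self.args)


# Hypothetical data and stand-in check: no network access is performed.
BAD_PROXIES = {"10.0.0.2:8080"}


def fake_check(proxy):
    """Return 1 for a 'bad' proxy and 0 for a working one, like is_bad_proxy."""
    return 1 if proxy in BAD_PROXIES else 0


def check_all(proxies, check=fake_check):
    """Check every proxy in its own thread and return {proxy: result}."""
    results = {}
    lock = threading.Lock()

    def record(result, proxy):
        # Callbacks run inside worker threads, so guard the shared dict.
        with lock:
            results[proxy] = result

    threads = [CheckThread(check, (p,), record) for p in proxies]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every check has reported back
    return results
```

Calling check_all(proxyList, is_bad_proxy) would then return a dictionary mapping each proxy to its result once all threads have finished, which may be easier to post-process than printing from inside the callback.
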
> 
> Also, you might want to limit the number of concurrent threads so as not
> to overload your system. One quick and dirty way to do this is to put
> the following before each call in the loop...
> 
> import time, threading
> while threading.activeCount() > 9: time.sleep(1)
> 
> Note, this is far from an exact method, but it works well enough for
> one-off scripting use.
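
The activeCount() check above is approximate. One alternative, sketched here on the assumption that a fixed pool of worker threads pulling from a queue is acceptable, caps concurrency exactly. The names check_with_pool and num_workers are hypothetical; check is any function that takes a proxy string, such as is_bad_proxy. This is a different technique from the sleep-based throttle, not a rewrite of it.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2


def check_with_pool(proxies, check, num_workers=10):
    """Run check(proxy) for every proxy using at most num_workers threads."""
    tasks = queue.Queue()
    for proxy in proxies:
        tasks.put(proxy)

    results = {}
    lock = threading.Lock()

    def worker():
        # All tasks are queued before any worker starts, so an empty
        # queue means there is nothing left to do.
        while True:
            try:
                proxy = tasks.get_nowait()
            except queue.Empty:
                return
            result = check(proxy)
            with lock:
                results[proxy] = result

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because only num_workers threads ever exist, the number of simultaneous connections never exceeds that value, however long the proxy list is.
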
> 
> Hope this helps.
> 
> 
> Suggestions from hardcore pythonistas on how to make my run_in_thread
> function more elegant are also quite welcome :)
> 
> 
> Roger Heathcote
> -- 
> http://mail.python.org/mailman/listinfo/python-list
> 
> 
Hello :)
Thanks for your reply!
I will test it now and will comment soon.
Thanks again.

-- 
View this message in context: http://old.nabble.com/python-proxy-checker-%2Cchange-to-threaded-version-tp26672548p26673953.html
Sent from the Python - python-list mailing list archive at Nabble.com.



