simple httplib and urllib timeout question

Georg Mischler schorsch at schorsch.com
Thu Apr 13 10:23:01 CEST 2000


Michal Wallace wrote:

> there's an example script called blogbot.py there, too.. it uses
> timeout.py and another hack with signal.alarm(), but neither really
> works all that well..

What exactly is your problem with signal.alarm() ?
It works fine for me. Of course, I haven't tried this
on any M$ system, and it's not the right approach for
concurrent connections.
What I do is fairly simple:


import httplib
import signal
import socket
import string
import urllib

def alarm_handler(signum, frame):
    '''what to do when an alarm signal reaches us'''
    raise socket.error, 'connection timed out'

USER_AGENT = 'DemoBot/0.0.7'
FETCH_TIMEOUT = 60 # in seconds

def fetch_request(req, host, path):
    '''fetch a page online with error handling and timeout,
       req is either 'GET' or 'POST' '''
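    if req == 'POST':
        # split off the form data first: urllib.quote() would escape the '?'
        # (this assumes the form data is already url-encoded)
        path, form = string.split(path, '?', 1)
    path = urllib.quote(path)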
    signal.signal(signal.SIGALRM, alarm_handler)
    signal.alarm(FETCH_TIMEOUT)
    try:
        connection = httplib.HTTP(host)
        connection.putrequest(req, path)
        connection.putheader('Host', host)
        connection.putheader('User-Agent', USER_AGENT)
        connection.putheader('Accept', 'text/html')
        if req == 'POST':
            connection.putheader('Content-Type',
                'application/x-www-form-urlencoded')
            connection.putheader('Content-Length', str(len(form)))
        connection.endheaders()
        if req == 'POST':
            connection.send(form)
        errcode, errmsg, headers = connection.getreply()
    finally:
        signal.alarm(0) # reset timeout
    return connection, errcode, errmsg, headers


The calling function will then read from the connection
and close it again. It will also catch any network related
exceptions, including the timeout.
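
A minimal sketch of that caller side, written for present-day
Python 3 rather than the Python of this post. The helper names
call_with_alarm and safe_call are made up for illustration, and
the lambda stands in for the real network code. Like the original,
it only works on Unix and in the main thread:

```python
import signal
import socket
import time

def alarm_handler(signum, frame):
    '''what to do when an alarm signal reaches us'''
    raise socket.error('connection timed out')

def call_with_alarm(func, timeout):
    '''run func(), giving up after `timeout` whole seconds
       (hypothetical helper, not part of any library)'''
    old_handler = signal.signal(signal.SIGALRM, alarm_handler)
    signal.alarm(timeout)
    try:
        return func()
    finally:
        signal.alarm(0)  # reset timeout
        signal.signal(signal.SIGALRM, old_handler)

def safe_call(func, timeout):
    '''the caller: a timeout is caught like any other network error'''
    try:
        return call_with_alarm(func, timeout)
    except socket.error as exc:
        return 'failed: %s' % exc

if __name__ == '__main__':
    print(safe_call(lambda: 'ok', 1))
    print(safe_call(lambda: time.sleep(3), 1))  # stands in for a slow host
```

Because the handler raises socket.error, the caller needs no
special case for the timeout; it arrives through the same except
clause as a refused or reset connection.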

All this assumes that there is only one connection in place
at any time, and you just don't want to wait overly long for
unresponsive hosts. If you need concurrent connections,
then asyncore is more powerful. You will have to maintain
a lot of state, but once you understand how all the different
parts of asyncore work together it should be fairly easy to
get the individual timeouts to do the right thing.
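
To give an idea of the state involved, here is a rough sketch of
per-connection timeouts in present-day Python 3. It does not use
asyncore (which has been removed from recent Python releases) but
calls select() directly, which is what asyncore's loop is built
on. The function name read_with_deadlines and its argument format
are assumptions made for this example only:

```python
import select
import socket
import time

def read_with_deadlines(timeouts):
    '''timeouts maps each connected socket to its own timeout in
       seconds. Returns a dict mapping each socket to the first
       chunk of data received, or None if that socket timed out.'''
    start = time.monotonic()
    deadlines = {}
    for sock, seconds in timeouts.items():
        deadlines[sock] = start + seconds
    results = {}
    while deadlines:
        # sleep no longer than the nearest deadline
        wait = max(0.0, min(deadlines.values()) - time.monotonic())
        readable, _, _ = select.select(list(deadlines), [], [], wait)
        for sock in readable:
            results[sock] = sock.recv(4096)
            del deadlines[sock]
        # expire every connection whose own deadline has passed
        now = time.monotonic()
        for sock in [s for s, d in deadlines.items() if d <= now]:
            results[sock] = None
            del deadlines[sock]
    return results

if __name__ == '__main__':
    a1, b1 = socket.socketpair()   # local stand-ins for two hosts
    a2, b2 = socket.socketpair()
    b1.send(b'hello')              # the first "host" answers
    res = read_with_deadlines({a1: 0.5, a2: 0.2})
    print(res[a1], res[a2])
```

The deadlines dict is the per-connection state mentioned above: one
slow host only uses up its own timeout, and the others are served
as soon as they have data.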


Have fun!

-schorsch

--
Georg Mischler  --  simulations developer  --  schorsch at schorsch.com
+schorsch.com+  --  lighting design tools  --  http://www.schorsch.com/

