n00b confusion re: local variable referenced before assignment error
Diez B. Roggisch
deets at nospam.web.de
Fri Jun 19 12:30:48 EDT 2009
Wells Oliver wrote:
> Writing a class which essentially spiders a site and saves the files
> locally. On a URLError exception, it sleeps for a second and tries again
> (on 404 it just moves on). The relevant bit of code, including the
> offending method:
>
> class Handler(threading.Thread):
>     def __init__(self, url):
>         threading.Thread.__init__(self)
>         self.url = url
>
>     def save(self, uri, location):
>         try:
>             handler = urllib2.urlopen(uri)
>         except urllib2.HTTPError, e:
>             if e.code == 404:
>                 return
>             else:
>                 print "retrying %s (HTTPError)" % uri
>                 time.sleep(1)
>                 self.save(uri, location)
>         except urllib2.URLError, e:
>             print "retrying %s" % uri
>             time.sleep(1)
>             self.save(uri, location)
>
>         if not os.path.exists(os.path.dirname(location)):
>             os.makedirs(os.path.dirname(location))
>
>         file = open(location, "w")
>         file.write(handler.read())
>         file.close()
>
> ...
>
> But what I am seeing is that after a retry (on catching a URLError
> exception), I see bunches of "UnboundLocalError: local variable
> 'handler' referenced before assignment" errors on line 38, which is the
> "file.write(handler.read())" line..
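Your code binds the name handler only if urllib2.urlopen succeeds. On
the retry path the except branch calls self.save() recursively, and once
that call returns, execution falls through to the file-writing code of
the outer call, where handler was never assigned. Accessing it
unconditionally there fails, of course.
You need to put the file handling after the urlopen call, inside the
try-except, so that it runs only when the fetch succeeded.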
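A minimal sketch of that restructuring, written for Python 3 (where
urllib2 is split into urllib.request and urllib.error) rather than the
Python 2 of the quoted code. The standalone function save, its opener
parameter and the True/False return value are assumptions made for
illustration and are not part of the original class. Retrying is left
out here to keep the focus on where handler gets bound.

```python
import os
import urllib.error
import urllib.request


def save(uri, location, opener=urllib.request.urlopen):
    """Fetch uri and write it to location; return True if a file was written.

    opener is a hypothetical hook so the function can be exercised
    without network access; by default it is urllib.request.urlopen.
    """
    try:
        handler = opener(uri)
    except urllib.error.HTTPError as e:
        if e.code == 404:
            return False  # nothing to save, move on
        raise  # other HTTP errors are left to the caller in this sketch
    else:
        # The else clause runs only when the try body raised nothing,
        # so handler is guaranteed to be bound here.
        directory = os.path.dirname(location)
        if directory and not os.path.exists(directory):
            os.makedirs(directory)
        with open(location, "wb") as f:
            f.write(handler.read())
        return True
```

Putting the file handling in the else clause keeps it part of the
try statement without also catching exceptions raised while writing.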
Also note that Python has no tail-call optimization, so your method
recurses once per failed attempt and will at some point hit the
interpreter's recursion limit if there are many errors.
You should consider writing it as a while loop instead, breaking out
of it once the page has been fetched.
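A rough sketch of the loop-based variant, again in Python 3 syntax
rather than the Python 2 of the quoted code. The function name
fetch_with_retry and the parameters opener, delay and max_tries are
assumptions for illustration; the original code retries forever, which
corresponds to max_tries=None here.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(uri, opener=urllib.request.urlopen, delay=1.0, max_tries=None):
    """Return the response body as bytes, or None if the server answers 404.

    opener, delay and max_tries are hypothetical knobs for this sketch.
    With max_tries=None the loop retries until the fetch succeeds.
    """
    tries = 0
    while True:
        tries += 1
        try:
            # Success leaves the loop immediately via return.
            return opener(uri).read()
        except urllib.error.HTTPError as e:
            if e.code == 404:
                return None  # missing page: give up without retrying
            last_error = e
        except urllib.error.URLError as e:
            last_error = e
        # Only reached after a retryable error.
        if max_tries is not None and tries >= max_tries:
            raise last_error
        print("retrying %s" % uri)
        time.sleep(delay)
```

Each failed attempt is just another loop iteration, so the call stack
does not grow no matter how many retries are needed. HTTPError has to
be caught before URLError because it is a subclass of it.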
Diez