How to test a URL request in a "while True" loop

Brian D briandenzer at gmail.com
Wed Dec 30 20:17:19 EST 2009


On Dec 30, 7:08 pm, MRAB <pyt... at mrabarnett.plus.com> wrote:
> Brian D wrote:
> > Thanks MRAB as well. I've printed all of the replies to retain with my
> > pile of essential documentation.
>
> > To follow up with a complete response, I'm ripping out of my mechanize
> > module the essential components of the solution I got to work.
>
> > The main body of the code passes a URL to the scrape_records function.
> > The function attempts to open the URL five times.
>
> > If the URL is opened, a values dictionary is populated and returned to
> > the calling statement. If the URL cannot be opened, a fatal error is
> > printed and the module terminates. There's a little sleep call in the
> > function to leave time for any errant connection problem to resolve
> > itself.
>
> > Thanks to all for your replies. I hope this helps someone else:
>
> > import urllib2, time
> > from mechanize import Browser
>
> > def scrape_records(url):
> >     maxattempts = 5
> >     br = Browser()
> >     user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.16) Gecko/2009120208 Firefox/3.0.16 (.NET CLR 3.5.30729)'
> >     br.addheaders = [('User-agent', user_agent)]
> >     for count in xrange(maxattempts):
> >         try:
> >             print url, count
> >             br.open(url)
> >             break
> >         except urllib2.URLError:
> >             print 'URL error', count
> >             # Pretend a failed connection was fixed
> >             if count == 2:
> >                 url = 'http://www.google.com'
> >             time.sleep(1)
> >             pass
>
> 'pass' isn't necessary.
>
> >     else:
> >         print 'Fatal URL error. Process terminated.'
> >         return None
> >     # Scrape page and populate valuesDict
> >     valuesDict = {}
> >     return valuesDict
>
> > url = 'http://badurl'
> > valuesDict = scrape_records(url)
> > if valuesDict == None:
>
> When checking whether or not something is a singleton, such as None, use
> "is" or "is not" instead of "==" or "!=".
>
> >     print 'Failed to retrieve valuesDict'
>
>

I'm definitely acquiring some well-deserved schooling -- and it's
really appreciated. I'd seen the "is/is not" preference before, but it
just didn't stick.
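
One way to make the "is/is not" rule stick: "==" calls a class's __eq__ method, which can be overridden, while "is" only compares object identity. A minimal sketch, assuming a made-up AlwaysEqual class that exists purely for illustration:

```python
class AlwaysEqual(object):
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()

# The equality test is fooled by the overridden __eq__ ...
equal_to_none = (value == None)  # deliberate, to show the pitfall

# ... while the identity test is not, because value is not the None object.
is_none = (value is None)

print(equal_to_none, is_none)
```

Since there is only one None object, the identity test cannot be misled by whatever the other operand defines.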

I see now that "pass" is redundant -- thanks for catching that.
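
With both corrections applied, the function might look roughly like the sketch below. It is not the code posted above: it assumes Python 3 and the standard urllib in place of mechanize, and the opener, max_attempts, and delay parameters (plus the always_fails stand-in) are illustrative additions so the example can run without a network connection.

```python
import time
import urllib.error
import urllib.request


def scrape_records(url, opener=urllib.request.urlopen, max_attempts=5, delay=1.0):
    """Try to open url up to max_attempts times.

    Returns a dict of scraped values on success, or None if every
    attempt raises URLError. opener and delay are illustrative
    parameters, not part of the original function.
    """
    for count in range(max_attempts):
        try:
            print(url, count)
            opener(url)
            break  # success: the else clause below is skipped
        except urllib.error.URLError:
            print('URL error', count)
            time.sleep(delay)  # give a transient problem time to clear
    else:
        # Runs only when the loop finished without hitting break.
        print('Fatal URL error. Process terminated.')
        return None
    # Scrape page and populate values_dict (left empty in this sketch)
    values_dict = {}
    return values_dict


def always_fails(url):
    """Stand-in opener that simulates an unreachable host."""
    raise urllib.error.URLError('simulated failure')


values_dict = scrape_records('http://badurl', opener=always_fails, delay=0)
if values_dict is None:
    print('Failed to retrieve values_dict')
```

The for/else form keeps the failure path in one place: the else branch runs only if no attempt reached break, so no separate success flag is needed.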

Cheers.


