urllib2.HTTPError: HTTP Error 204: NoContent

Mark Sapiro slash_dev_slash_null_2000 at yahoo.com
Sun Oct 19 19:09:56 EDT 2008


On Oct 19, 9:49 am, Philip Semanchuk <phi... at semanchuk.com> wrote:
> On Oct 19, 2008, at 6:13 AM, silk.odyssey wrote:
>
> > I am getting the following error trying to download an html page using
> > urllib2.
>
> > urllib2.HTTPError: HTTP Error 204: NoContent
>
> > The url is of this type:
>
> > http://www.amazon.com/gp/offer-listing/B000KJX3A0%3FSubscriptionId%3D...
>
> > I can open it in my browser without problems. Any ideas on a solution?
>
> Are you changing the user-agent? Some sites sniff user agents and  
> return different results to browsers than to suspected bots.


I tried it.

>>> import urllib2
>>> url = 'http://www.amazon.com/gp/offer-listing/B000KJX3A0%3FSubscriptionId%3D183VXJS74KNQ89D0NRR2%26tag%3Dws%26linkCode%3Dxm2%26camp%3D2025%26creative%3D386001%26creativeASIN%3DB000KJX3A0'
>>> op = urllib2.urlopen(url)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/urllib2.py", line 121, in urlopen
    return _opener.open(url, data)
  File "/usr/lib/python2.5/urllib2.py", line 380, in open
    response = meth(req, response)
  File "/usr/lib/python2.5/urllib2.py", line 491, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.5/urllib2.py", line 418, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.5/urllib2.py", line 353, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.5/urllib2.py", line 499, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 204: NoContent
>>> headers = {}
>>> headers['User-Agent'] = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3'
>>> ro = urllib2.Request(url, None, headers)
>>> op = urllib2.urlopen(ro)
>>> page = op.read()
>>> page
 (lots of HTML)

So the answer is as Philip suggests - amazon.com doesn't like
'Python-urllib/2.5' as a User-Agent. You have to give it something that
looks like a browser.
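
A minimal sketch of the same fix wrapped in reusable functions, in case it
helps. The names build_request, fetch and BROWSER_UA are made up for this
example and are not part of urllib2. The import fallback assumes that on
Python 3 urllib.request stands in for urllib2. Whether a particular
User-Agent string is accepted is up to the site, so treat the string below
as one that worked in the session above, not as a guarantee.

```python
try:
    import urllib2                      # Python 2, as used in this thread
except ImportError:
    import urllib.request as urllib2    # assumption: Python 3 stand-in

# Browser-like User-Agent string copied from the session above.
BROWSER_UA = ('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3) '
              'Gecko/2008092417 Firefox/3.0.3')


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request that sends user_agent instead of the default
    'Python-urllib/x.y' header. (Hypothetical helper.)"""
    return urllib2.Request(url, None, {'User-Agent': user_agent})


def fetch(url, user_agent=BROWSER_UA):
    """Open url with the given User-Agent and return the response body.
    (Hypothetical helper; raises HTTPError on failure, as urlopen does.)"""
    response = urllib2.urlopen(build_request(url, user_agent))
    try:
        return response.read()
    finally:
        response.close()
```

With the url from the session above, page = fetch(url) should behave like
the Request/urlopen/read sequence I typed by hand.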

--
(for email use this address please - you can figure it out)

Mark Sapiro mark at msapiro net       Any clod can have the facts;
San Francisco Bay Area, California    having opinions is an art.
                                      - C. McCabe, The Fearless Spectator


