[Tutor] Error 403 when accessing wikipedia articles?
Alan Gauld
alan.gauld at btinternet.com
Sat Oct 27 10:19:21 CEST 2007
"Alex Ryu" <ryu.alex at gmail.com> wrote
> I'm trying to use python to automatically download and process a
> (small) number of wikipedia articles. However, I keep getting a 403
> (Forbidden Error), when using urllib2:
FWIW I had a similar problem in trying to use Google to illustrate
the use of urllib2 in my tutorial. It seems some web sites implement
measures to prevent robotic access. I assume you could spoof your
browser's characteristics and fool the system, but I tend to take the
view that if the owner doesn't like robots then I'd better respect
that, so I haven't tried.
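For the record, a 403 like this usually just means the server rejected
urllib's default User-Agent. Wikipedia's servers, as I understand it,
prefer that scripts send a *descriptive* User-Agent identifying the
client, rather than a spoofed browser string. A minimal sketch using
today's urllib.request (urllib2's successor in Python 3); the URL and
agent string are made up for illustration:

```python
import urllib.request  # urllib2's functionality lives here in Python 3

# Hypothetical article URL, for illustration only.
url = "https://en.wikipedia.org/wiki/Python_(programming_language)"

# Supply a descriptive User-Agent that honestly identifies the script,
# instead of impersonating a browser.
req = urllib.request.Request(
    url,
    headers={"User-Agent": "MyArticleFetcher/0.1 (contact: me@example.com)"},
)

# urllib stores header names capitalized, so the key is 'User-agent'.
print(req.get_header("User-agent"))

# The actual fetch would then be:
#   with urllib.request.urlopen(req) as resp:
#       text = resp.read().decode("utf-8")
```

Whether you *should* do this is another matter, per the point above
about respecting the site owner's wishes.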
All of which reminds me that I really need to finish writing
that topic! :-)
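One way to respect a site's wishes programmatically is to check its
robots.txt before fetching. A small sketch using the standard library's
urllib.robotparser; the robots.txt content and URLs here are invented
for the example (in real use you would point set_url() at the site's
actual robots.txt and call read()):

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt, parsed directly for demonstration.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Paths outside the disallowed prefix are fetchable; those inside are not.
print(rp.can_fetch("MyBot", "https://example.com/articles/foo"))  # True
print(rp.can_fetch("MyBot", "https://example.com/private/foo"))   # False
```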
> File "G:\Python25\lib\urllib2.py", line 499, in http_error_default
> raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
> HTTPError: HTTP Error 403: Forbidden
>
> Now, when I use urllib instead of urllib2, something different
> happens:
>
> from 98.195.188.89 via sq27.wikimedia.org (squid/2.6.STABLE13) to ()
> Error: ERR_ACCESS_DENIED, errno [No Error] at Sat, 27 Oct 2007
HTH,
--
Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld