liburl can't load webpage with JavaScript

John J. Lee jjl at pobox.com
Sun May 16 08:51:12 EDT 2004


gatti at dsdata.it (Lorenzo Gatti) writes:
> Uwe Mayer <merkosh at hadiko.de> wrote in message news:<c80oqk$t7$1 at news2.rz.uni-karlsruhe.de>...
[...]
> > I had a closer look at the html source and discovered a lot of Javascript,
> > including Cookies.
[...]
> Mozilla is a web browser, and it implements cookies, DOM for HTML
> pages, and a Javascript interpreter with objects representing browser
> automation.
> It's unlikely and inappropriate for low level HTTP implementations
> like wget and liburl to have that kind of support for advanced web
[...]

JavaScript support is rare, but many libraries and tools support
cookies (including wget and my library, ClientCookie -- essentially a
drop-in replacement for urllib2).  For JS, see my FAQ here (under
"Embedded script is messing up my web-scraping. What do I do?"):

http://wwwsearch.sourceforge.net/bits/GeneralFAQ.html
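
For the cookie half of the problem, here is a minimal sketch of cookie-aware
fetching. It assumes a current Python 3 and uses the standard library's
http.cookiejar and urllib.request rather than ClientCookie's own API; the
helper names make_cookie_opener and fetch_with_cookies are made up for
illustration. It does nothing about JavaScript.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener():
    """Return (opener, jar); the opener stores and resends cookies via jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def fetch_with_cookies(opener, url):
    """Fetch url through the cookie-aware opener and return the body bytes."""
    with opener.open(url, timeout=10) as response:
        return response.read()
```

Calling fetch_with_cookies(opener, url) repeatedly with the same opener
sends back any cookies the server set on earlier responses, and iterating
over jar lists the cookies collected so far.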


> In the specific case of "IOError: connection refused, Error Code 111",
> however, the failure seems to happen at a lower protocol level: wrong
[...]

Right.
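
One way to confirm that a failure like this happens below HTTP is to check
the underlying socket error. The sketch below assumes a current Python 3,
where urllib.request wraps the OSError in URLError.reason; the helper names
are hypothetical. On Linux, errno.ECONNREFUSED is the 111 from the message
above.

```python
import errno
import urllib.error
import urllib.request


def is_connection_refused(exc):
    """True if exc is, or wraps, an OSError with errno ECONNREFUSED."""
    # URLError carries the low-level error in .reason; plain OSErrors
    # have no .reason, so fall back to the exception itself.
    reason = getattr(exc, "reason", exc)
    return isinstance(reason, OSError) and reason.errno == errno.ECONNREFUSED


def try_fetch(url):
    """Return the body bytes, or None if the TCP connection was refused."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.read()
    except urllib.error.URLError as exc:
        if is_connection_refused(exc):
            # Nothing accepted the connection: check host, port and proxy
            # settings before looking at cookies or JavaScript.
            return None
        raise
```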


John


