request: web site copy utility
Simon B.
brunns at beer.com
Tue Jun 20 07:24:53 EDT 2000
In article <393207A4.C04E6BDB at earthlink.net>,
Simon <tomega at earthlink.net> wrote:
> websucker.py, while an awesome utility, doesn't actually suck the
> entire site. For instance, it will pull the image files on a page if
> those images are statically SRC-ed. If there is an onmouseover
> directive, for instance, which changes the image source to another
> file, it will not grab the other file, even if it lies in the same
> directory as the first image.
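One way the rollover case might be handled is to scan the event-handler
attributes for quoted image file names as well as the static SRC values.
This is only a rough sketch, not websucker's own code: it assumes a
current Python 3 standard library, and the names IMAGE_REF,
RolloverImageFinder and find_image_urls are made up for illustration.
It will miss any image name that a script builds up at run time.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: rollover images appear as quoted strings ending in a common
# image extension inside the event-handler attribute.
IMAGE_REF = re.compile(r"""['"]([^'"]+\.(?:gif|jpe?g|png))['"]""", re.IGNORECASE)


class RolloverImageFinder(HTMLParser):
    """Collect image URLs from static SRC attributes and mouse handlers."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        for name, value in attrs:
            if value is None:
                continue
            if tag == "img" and name == "src":
                self.urls.append(urljoin(self.base_url, value))
            elif name in ("onmouseover", "onmouseout"):
                for ref in IMAGE_REF.findall(value):
                    self.urls.append(urljoin(self.base_url, ref))


def find_image_urls(html, base_url):
    """Return absolute image URLs found in the given HTML text."""
    finder = RolloverImageFinder(base_url)
    finder.feed(html)
    finder.close()
    return finder.urls
```

Feeding it a page and the URL the page came from returns absolute URLs,
so the swapped-in image can be fetched the same way as the static one.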
I have problems with some sites where the URLs are fully qualified
rather than relative; see
<http://publib.boulder.ibm.com/cgi-bin/bookmgr/BOOKS/QBKAQV00/CCONTENTS>
for an example. At least, I *think* that's what the problem is. Oh, and
Java applets aren't sucked, either.
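To make the suspicion concrete: a copier that wants to treat relative
and fully qualified links alike can resolve every link against the page
it came from and only then decide whether the result is inside the
site. The sketch below shows that idea under stated assumptions; it is
not taken from the websucker source, it uses the current Python 3
urllib.parse module, and in_site is a hypothetical helper name.

```python
from urllib.parse import urljoin, urlsplit


def in_site(link, page_url, root_url):
    """Return True if link, as found on page_url, lies under root_url.

    Relative and fully qualified links go through the same check because
    urljoin leaves an absolute link unchanged and resolves a relative one.
    """
    target = urlsplit(urljoin(page_url, link))
    root = urlsplit(root_url)
    same_host = (target.scheme, target.netloc.lower()) == (
        root.scheme,
        root.netloc.lower(),
    )
    # Assumption: "inside the site" means the path starts with the root path.
    return same_host and target.path.startswith(root.path)
```

With a root of http://example.com/docs/, both "../b.html" found on
http://example.com/docs/a/index.html and the fully qualified
http://example.com/docs/c.html pass the check, while a link to another
host does not.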
I've had a quick look into the websucker source, but I haven't spotted
anything yet. I'll keep looking, but can anyone help me out here?
--
Simon B